Re: [HACKERS] resolving SQL ambiguity (was Re: WIP: Allow SQL-lang funcs to ref params by param name)

2011-03-26 Thread Darren Duncan

Pavel Stehule wrote:

sorry - I dislike this. The design is correct, but it is against to
SQL verbosity. A reader must to thinking about missing tablenames. I
dont't think so it is good solution, because it doesn't solve a
backing compatibility problem - somebody must to fix a function still,
and I think so it is much preferable to fix like:

  select (mytbl.mycol + myvar * myparam) as mynewcol from mytbl;

your proposal saves a five chars, but it has a negative impacts on
readability - there should be more tables.

There are no reason to introduce a new concepts - SQL knows a aliases.


Well, going forward, I know I would much rather have to say "mytbl.mycol" than 
have to say "myfunc.myparam".  And I certainly would want to expect that when 
one says "... as foo" that this "foo" is treated as a declaration unambiguously 
and is never substituted for some parameter or there be other grief as I seem to 
recall having in 8.4. -- Darren Duncan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] resolving SQL ambiguity (was Re: WIP: Allow SQL-lang funcs to ref params by param name)

2011-03-26 Thread Pavel Stehule
2011/3/26 Darren Duncan :
> Pavel Stehule wrote:
>>
>> 2011/3/26 Darren Duncan :
>>>
>>> I mention 2 possible solutions here, both which involve syntax
>>> alterations,
>>> each between the -- lines.  I personally like the second/lower
>>> option more.
>>
>> -1
>>
>> this is not based on any pattern on SQL. It's not simple, and it
>> introduce a reserved keywords
>
> Okay, here's a much simpler proposal with the most important bit of the old
> one.
>
> 1.  In all situations where there is ambiguity such that an identifier
> reference (not declaration) may be referring to either a lexical
> variable/parameter of the current routine, or to the name of the table
> column of the contextually current table of the current SQL statement, the
> ambiguity is always resolved in favor of the lexical var/param.  If I am not
> mistaken, that is what PL/PgSQL already does since 9.0.
>
> 2.  If an identifier reference has a leading "." then that will force it to
> be interpreted as a column instead (and the code will fail if there is no
> such column), and so ".colname" is a shorthand for "tablename.colname"; but
> like with the old "colname" it only works when just 1 of the source tables
> has "colname" else it is still ambiguous like before.
>
> Example:
>
>    select (.mycol + myvar * myparam) as mynewcol from mytbl;
>
> This solution is a very terse and understandable change.
>
> There are no reserved keywords.  Legacy user code has no change where there
> were no conflicts before.  Legacy user code has no change in the case of
> conflict if it was previously resolved to favor the lexical var/param.
>
> Legacy user code only gains a leading "." in the few places where conflict
> was resolved in favor of a column name before where a same-named
> lexical/param existed.
>
> So what's not to like about this?

sorry - I dislike this. The design is correct, but it is against to
SQL verbosity. A reader must to thinking about missing tablenames. I
dont't think so it is good solution, because it doesn't solve a
backing compatibility problem - somebody must to fix a function still,
and I think so it is much preferable to fix like:

  select (mytbl.mycol + myvar * myparam) as mynewcol from mytbl;

your proposal saves a five chars, but it has a negative impacts on
readability - there should be more tables.

There are no reason to introduce a new concepts - SQL knows a aliases.

Regards

Pavel

>
> -- Darren Duncan
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] resolving SQL ambiguity (was Re: WIP: Allow SQL-lang funcs to ref params by param name)

2011-03-26 Thread Darren Duncan

Pavel Stehule wrote:

2011/3/26 Darren Duncan :

I mention 2 possible solutions here, both which involve syntax alterations,
each between the -- lines.  I personally like the second/lower
option more.


-1

this is not based on any pattern on SQL. It's not simple, and it
introduce a reserved keywords


Okay, here's a much simpler proposal with the most important bit of the old one.

1.  In all situations where there is ambiguity such that an identifier reference 
(not declaration) may be referring to either a lexical variable/parameter of the 
current routine, or to the name of the table column of the contextually current 
table of the current SQL statement, the ambiguity is always resolved in favor of 
the lexical var/param.  If I am not mistaken, that is what PL/PgSQL already does 
since 9.0.


2.  If an identifier reference has a leading "." then that will force it to be 
interpreted as a column instead (and the code will fail if there is no such 
column), and so ".colname" is a shorthand for "tablename.colname"; but like with 
the old "colname" it only works when just 1 of the source tables has "colname" 
else it is still ambiguous like before.


Example:

select (.mycol + myvar * myparam) as mynewcol from mytbl;

This solution is a very terse and understandable change.

There are no reserved keywords.  Legacy user code has no change where there were 
no conflicts before.  Legacy user code has no change in the case of conflict if 
it was previously resolved to favor the lexical var/param.


Legacy user code only gains a leading "." in the few places where conflict was 
resolved in favor of a column name before where a same-named lexical/param existed.


So what's not to like about this?

-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] resolving SQL ambiguity (was Re: WIP: Allow SQL-lang funcs to ref params by param name)

2011-03-26 Thread Pavel Stehule
2011/3/26 Darren Duncan :
> Robert Haas wrote:
>>
>> On Mar 25, 2011, at 9:22 PM, Joshua Berkus  wrote:
>>>
>>> Tom,
>>>
 Personally I'd vote for *not* having any such dangerous semantics as
 that. We should have learned better by now from plpgsql experience.
 I think the best idea is to throw error for ambiguous references,
 period.
>>>
>>> As a likely heavy user of this feature, I agree with Tom here.  I really
>>> don't want the column being silently preferred in SQL functions, when
>>> PL/pgSQL functions are throwing an error.  I'd end up spending hours
>>> debugging this.
>>>
>>> Also, I don't understand why this would be a dump/reload issue if $1 and
>>> $2 continue to work.
>>
>> Because an identifier that previously referred unambiguously to a column
>> might now be ambiguous, if there is a parameter with the same name.
>
> I mention 2 possible solutions here, both which involve syntax alterations,
> each between the -- lines.  I personally like the second/lower
> option more.
>
> 
>
> Might it be reasonable, perhaps as a 9.2 feature, to add
> top-level-namespaces so that one could always explicitly qualify what they
> are referring to?
>
> For example, you could have the 3 "sch", "lex", "attr" (I may have missed
> some useful ones).
>
> The "sch" TLN would unambiguously refer directly to a schema object, such as
> a database table.
>
> The "lex" TLN would unambiguously refer directly to a lexical, either a
> parameter of the current routine or to a lexical variable.
>
> The "attr" TLN would unambiguously refer to a table/etc column/attribute in
> the manner typical for SQL.
>
> Use them like:
>
>  sch.foo - the table/etc foo
>  lex.foo - the lexical variable foo
>  attr.foo - the column foo
>
> Use of these TLN are optional where there is no ambiguity.
>
> The TLN are not reserved words, but if one has an entity named the same,
> then references to it must be TLN-qualified; eg:
>
>  lex.sch
>  lex.lex
>  lex.attr
>
> Now these are just examples.  You may find a different set works better.

-1

this is not based on any pattern on SQL. It's not simple, and it
introduce a reserved keywords

Regards

Pavel

>
> --
>
> There are also alternate solutions.
>
> For example, it could be mandated that lexical-scope aliases for any
> data/var-like schema object are required in routines, where the aliases are
> distinct from all lexical vars/params/etc, and then all SQL/code in the
> routines may only refer to the schema objects by the aliases.
>
> Effectively this makes it so that routines can no longer see non-lexical
> vars but for those from parameters, and this aliasing is defining a
> parameter whose argument is supplied by the DBMS automatically rather than
> as an explicit routine caller argument.
>
> That way, inside a routine body there are only lexical names for things, and
> so no namespace-qualification is ever needed by the regular SQL.
>
> Similarly, if you always think of table column names as referring to an
> attribute or element of a table variable, then just reference the column
> qualified by the table name (or the lexical alias thereof).  Same as you do
> in any other programming language.  Of course, sometimes you don't have to
> qualify column name references as context could make it unambiguous.  Or, a
> shorthand like a simple leading "." could unambiguously say you're referring
> to a column of the particular table in context.
>
> With those in place, all unqualified references are straight to lexical
> variables or parameters.
>
> And so, this is also an effective way to resolve the ambiguity and I prefer
> the latter design personally.
>
> Here's an example in quasi-PL/PgSQL:
>
>  create function myrtn (myparam integer, mytbl ::= mydb.myschema.mytbl) as
>  declare
>      myvar integer := 5;
>  $body$
>  begin
>      select (.mycol + myvar * myparam) as mynewcol from mytbl;
>  end;
>  $body$
>
> Note that I've already thought through this last example as these methods of
> avoiding ambiguity are loosely-speaking how my language Muldis D avoids the
> problem faced by many SQL procedures.
>
> The ".mycol" syntax specifically was inspired originally for me by Perl 6
> where the lack of something just before the "." means that the implicit
> topic variable is referred to, like if you said "$_.mycol".
>
> A Perl 6 analogy being something like:
>
>  $mytbl.map:{ .mycol + $myvar * $myparam }
>
> aka:
>
>  $mytbl.map:{ $_.mycol + $myvar * $myparam }
>
> --
>
> -- Darren Duncan
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] resolving SQL ambiguity (was Re: WIP: Allow SQL-lang funcs to ref params by param name)

2011-03-25 Thread Darren Duncan

Robert Haas wrote:

On Mar 25, 2011, at 9:22 PM, Joshua Berkus  wrote:

Tom,


Personally I'd vote for *not* having any such dangerous semantics as
that. We should have learned better by now from plpgsql experience.
I think the best idea is to throw error for ambiguous references,
period. 

As a likely heavy user of this feature, I agree with Tom here.  I really don't 
want the column being silently preferred in SQL functions, when PL/pgSQL 
functions are throwing an error.  I'd end up spending hours debugging this.

Also, I don't understand why this would be a dump/reload issue if $1 and $2 
continue to work.


Because an identifier that previously referred unambiguously to a column might 
now be ambiguous, if there is a parameter with the same name.


I mention 2 possible solutions here, both which involve syntax alterations, each 
between the -- lines.  I personally like the second/lower option more.




Might it be reasonable, perhaps as a 9.2 feature, to add top-level-namespaces so 
that one could always explicitly qualify what they are referring to?


For example, you could have the 3 "sch", "lex", "attr" (I may have missed some 
useful ones).


The "sch" TLN would unambiguously refer directly to a schema object, such as a 
database table.


The "lex" TLN would unambiguously refer directly to a lexical, either a 
parameter of the current routine or to a lexical variable.


The "attr" TLN would unambiguously refer to a table/etc column/attribute in the 
manner typical for SQL.


Use them like:

  sch.foo - the table/etc foo
  lex.foo - the lexical variable foo
  attr.foo - the column foo

Use of these TLN are optional where there is no ambiguity.

The TLN are not reserved words, but if one has an entity named the same, then 
references to it must be TLN-qualified; eg:


  lex.sch
  lex.lex
  lex.attr

Now these are just examples.  You may find a different set works better.

--

There are also alternate solutions.

For example, it could be mandated that lexical-scope aliases for any 
data/var-like schema object are required in routines, where the aliases are 
distinct from all lexical vars/params/etc, and then all SQL/code in the routines 
may only refer to the schema objects by the aliases.


Effectively this makes it so that routines can no longer see non-lexical vars 
but for those from parameters, and this aliasing is defining a parameter whose 
argument is supplied by the DBMS automatically rather than as an explicit 
routine caller argument.


That way, inside a routine body there are only lexical names for things, and so 
no namespace-qualification is ever needed by the regular SQL.


Similarly, if you always think of table column names as referring to an 
attribute or element of a table variable, then just reference the column 
qualified by the table name (or the lexical alias thereof).  Same as you do in 
any other programming language.  Of course, sometimes you don't have to qualify 
column name references as context could make it unambiguous.  Or, a shorthand 
like a simple leading "." could unambiguously say you're referring to a column 
of the particular table in context.


With those in place, all unqualified references are straight to lexical 
variables or parameters.


And so, this is also an effective way to resolve the ambiguity and I prefer the 
latter design personally.


Here's an example in quasi-PL/PgSQL:

  create function myrtn (myparam integer, mytbl ::= mydb.myschema.mytbl) as
  declare
  myvar integer := 5;
  $body$
  begin
  select (.mycol + myvar * myparam) as mynewcol from mytbl;
  end;
  $body$

Note that I've already thought through this last example as these methods of 
avoiding ambiguity are loosely-speaking how my language Muldis D avoids the 
problem faced by many SQL procedures.


The ".mycol" syntax specifically was inspired originally for me by Perl 6 where 
the lack of something just before the "." means that the implicit topic variable 
is referred to, like if you said "$_.mycol".


A Perl 6 analogy being something like:

  $mytbl.map:{ .mycol + $myvar * $myparam }

aka:

  $mytbl.map:{ $_.mycol + $myvar * $myparam }

--

-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers