On Fri, Apr 26, 2024 at 12:54 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> But that's exactly the point: we DON'T consider the initial identifier
> of a qualified name "as a unit", and neither does the standard.
> We have to figure out how many of the identifiers name an object
> (column or table) referenced in the query, and that is not clear
> up-front. SQL99's rules don't make that any better. The parens in
> our current notation serve to separate the object name from any field
> selection(s) done on the object. There's still some ambiguity,
> because we allow you to write either "(table.column).field" or
> "(table).column.field", but we've dealt with that for ages.

I agree that this is exactly the point. No other programming language
that I know of, and no other database that I know of, looks at x.y.z
and says "ok, well first we have to figure out whether the object is
named x or x.y or x.y.z, and then after that, we'll use whatever is
left over as a field selector." Instead, they have a top-level
namespace where x refers to one and only one thing, and then they look
for something called y inside of that, and if that's a valid object
then they look inside of that for z. JavaScript is probably the purest
example of this. Everything is an object, and x.y just looks up 'x' in
the object that is the current namespace. Assuming that returns an
object rather than nothing, we then try to find 'y' inside of that
object. I'm not an Oracle expert, but I am under the impression that
the way that Oracle works is closer to that than it is to our
read-the-tea-leaves approach.

I'm almost positive you're about to tell me that there's no way in the
infernal regions that we could make a semantics change of this
magnitude, and maybe you're right. But I think our current approach is
deeply unsatisfying and seriously counterintuitive. People do not get
an error about x.y and think "oh, right, I need to write (x).y so that
the parser understands that the name is x rather than x.y and the .y
part is field-selection rather than a part of the name itself." They
get an error about x.y and say "crap, I guess this syntax isn't
supported" and then when you show them that "(x).y" fixes it, they say
"why in the world does that fix it?" or "wow, that's dumb."

Imagine if we made _ perform string concatenation but also continued
to allow it as an identifier character. When we saw a_b without
spaces, we'd test for whether there's an a_b variable, and/or whether
there are a and b variables, to guess which interpretation was meant.
I hope we would all agree that this would be insane language design.
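
To spell out what that guessing would look like, here is a toy Python
sketch. Everything in it is made up for illustration: the function
name, the idea of passing the defined variables as a set, and the
restriction to a single binary concatenation. It is not taken from any
real parser.

```python
def interpretations(token, variables):
    """Return every reading of `token` that the defined variables allow.

    Hypothetical language: '_' is both an identifier character and the
    string concatenation operator. Only a single binary concatenation
    is considered, to keep the sketch short.
    """
    readings = []

    # Reading 1: the whole token is one identifier.
    if token in variables:
        readings.append(("identifier", token))

    # Reading 2: one of the underscores is the operator, with a defined
    # variable on each side of it.
    for i, ch in enumerate(token):
        if ch == "_":
            left, right = token[:i], token[i + 1:]
            if left in variables and right in variables:
                readings.append(("concat", left, right))

    return readings
```

With only a_b defined, a_b reads as an identifier; with only a and b
defined, it reads as a concatenation; with all three defined, both
readings come back and something has to break the tie. The meaning of
the same three characters depends on what else happens to be in scope.
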
Yet that's essentially what we've done with period, and I don't think
we can blame that on the SQL standard, because I don't think other
systems have this problem. I wonder if anyone knows of another system
that works like PostgreSQL in this regard (without sharing code).

-- 
Robert Haas
EDB: http://www.enterprisedb.com