On 2015/01/26 04:04, James K. Lowden wrote:
On Sun, 25 Jan 2015 23:18:05 +0200
RSmith <rsm...@rsweb.co.za> wrote:

There is no documentation in either SQLite or the SQL standard
that would lead anyone to believe that behavior is expected - in fact
it is very clear about the returned column names being
non-deterministic if not explicitly requested by aliasing.
I no longer think that is entirely true.  In SQLite's documentation,
it's not in the description of the SELECT statement that I can find
(http://www.sqlite.org/lang_select.html).  It is mentioned in the
sqlite3_column_name description, so, OK, it's not a bug.

But I'm going to say that a function that "returns the name" of a
column really should return the column's *name*.  Especially if that
name would suffice to address the column syntactically if the form
statement were wrapped one more level, i.e.:

        select * from (<select statement>) as T

Otherwise there's weird mismatch between the names SQL itself uses and
the ones presented to the user.

You are of course correct in feeling this way from a practical point of view, and I daresay SQLite (and the other engines) tries to return useful colmn naming where the cost of deduction does not impact performance. In fact, I believe it is this specific thing that lulls users or programmers into believing that the column naming returned by the engine for non-aliased columns is somehow governed by special rules or format and can therefore be trusted to be always conformant - where they really shouldn't.

Implementing a specific methodology for returning useful column naming based maybe on column references (as per your suggestion), I am not quite comfortable with. Forgetting for the moment any calculated columns and other obvious column-reference/naming problems, even in straight-forward reference schemes I think there will be difficulties - I can think of tables joined to themselves (maybe even more than once, a useful trick as you know), multiple tables joined where some columns appear in both but others not and other possibilities that might come to mind if more time is spent on it.

Understand, I do not think these are insurmountable problems, but two questions 
arise:
- Who decides the rules for handling it so that it may become "trusted" by DB users/admins/programmers, if not the SQL standard? - and, - What cost to performance will be acceptable? (i.e. how many cpu cycles might be spent to finding suitable column names when they are not specifically requested?)

Add to this the fact that for most queries where result column headers are not aliased to specifically required items, the actual naming just doesn't matter - and if it does,
   A - you are doing it wrong (which isn't a new rule), and
   B - It is so easy to add an alias request if you need an exact name.


I do understand that there is a small but important category of systems where this is paramount, like people making DB tools or admin tools which try to interpret data from user-specified queries, where random returned column naming is detrimental (I'm myself troubled by this), but I do not think the engines should cater for this if it comes at any sort of cost.


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to