Am 30.05.2008 um 12:45 schrieb Jens Miltner: > Hi, > > I've got a question and I don't know exactly whether the behavior is > standard or expected or not. > > Assuming the following schema and data: > CREATE TABLE t1 (id, name); > INSERT INTO t1 values (1, 'John'); > INSERT INTO t1 values (2, 'Arnold'); > INSERT INTO t1 values (3, 'Zork'); > > We have some queries that do roughly this: > > CREATE VIEW v1 AS SELECT id FROM t1 ORDER BY name; > CREATE TABLE t2 AS SELECT id FROM v1; > SELECT id from t2; > > This produces the ids in the expected order: > > 2 > 1 > 3 > > > However, when fetching DISTINCT values, the values are returned in > sorted order, not in the 'native' order in the table: > > SELECT DISTINCT id from t2; > > 1 > 2 > 3 > > > > Up until version 3.5.1 (well actually that was the last version > before 3.5.9 that we used), sqlite would return the distinct values > in the same order as in the query without the DISTINCT keyword. > > I'm suspecting that there is nothing that guarantees the order in > which results are returned unless I explicitely specify an "ORDER > BY" term, so this may be according to the standards, but it's > nonetheless a change that hurts us. > What we did so far was to create a temporary table that contained > record IDs in a specific order, e.g. > > CREATE TEMP TABLE foo AS SELECT record_id FROM bar WHERE <some > condition> ORDER BY lastname; > > then, after some further processing, we'd do something like > > SELECT DISTINCT record_id FROM foo; > > to get the record IDs in the defined order and eventually something > like > > SELECT lastname, firstname, contact_info > FROM foo > LEFT JOIN bar ON foo.record_id=bar.record_id > LEFT JOIN contacts ON contacts.bar_id=bar.record_id > WHERE foo.record_id=5; > > to get the information for a single record. > > The reasoning behind putting this sorted list of record IDs into a > temporary table is not really obvious from this example, but our > real code does a lot more processing and reuses some classes that > use temporary or real tables to store record selections. These are > mainly used as sets of records to display in lists for certain > [dynamically constructed] filter conditions. The display code will > then pick the record id and run a simple query to fetch the > displayed properties for that record. Since we can't cache the > information for all records in the displayed set (might be a really > large set initially), we came up with this approach to reduce the > size of the "cache" to the table containing just the record IDs. > However, in order to get a fast access, we have to rely on the order > of the record IDs in the table to match the order induced by the > "ORDER BY" expression... > > So, to cut a long story short: is this change in the returned order > for DISTINCT queries between version 3.5.1 and 3.5.9 a side effect > of switching the query engine? If so, is it an undesired side effect > or is it just something we have to live with?
As a follow-up: One idea that occurred to me to work around this problem was to add an "ORDER BY rowid" to my "SELECT DISTINCT record_id ..." statement to enforce the "old" order, but apparently, one cannot order by rowid (although it is a pseudo-column that can be SELECTed)? </jum> _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users