Am 30.05.2008 um 12:45 schrieb Jens Miltner:

> Hi,
>
> I've got a question and I don't know exactly whether the behavior is  
> standard or expected or not.
>
> Assuming the following schema and data:
>  CREATE TABLE t1 (id, name);
>  INSERT INTO t1 values (1, 'John');
>  INSERT INTO t1 values (2, 'Arnold');
>  INSERT INTO t1 values (3, 'Zork');
>
> We have some queries that do roughly this:
>
>  CREATE VIEW v1 AS SELECT id FROM t1 ORDER BY name;
>  CREATE TABLE t2 AS SELECT id FROM v1;
>  SELECT id from t2;
>
> This produces the ids in the expected order:
>
> 2
> 1
> 3
>
>
> However, when fetching DISTINCT values, the values are returned in  
> sorted order, not in the 'native' order in the table:
>
>  SELECT DISTINCT id from t2;
>
> 1
> 2
> 3
>
>
>
> Up until version 3.5.1 (well actually that was the last version  
> before 3.5.9 that we used), sqlite would return the distinct values  
> in the same order as in the query without the DISTINCT keyword.
>
> I'm suspecting that there is nothing that guarantees the order in  
> which results are returned unless I explicitely specify an "ORDER  
> BY" term, so this may be according to the standards, but it's  
> nonetheless a change that hurts us.
> What we did so far was to create a temporary table that contained  
> record IDs in a specific order, e.g.
>
>  CREATE TEMP TABLE foo AS SELECT record_id FROM bar WHERE <some  
> condition> ORDER BY lastname;
>
> then, after some further processing, we'd do something like
>
>  SELECT DISTINCT record_id FROM foo;
>
> to get the record IDs in the defined order and eventually something  
> like
>
>  SELECT lastname, firstname, contact_info
>  FROM foo
>  LEFT JOIN bar ON foo.record_id=bar.record_id
>  LEFT JOIN contacts ON contacts.bar_id=bar.record_id
>  WHERE foo.record_id=5;
>
> to get the information for a single record.
>
> The reasoning behind putting this sorted list of record IDs into a  
> temporary table is not really obvious from this example, but our  
> real code does a lot more processing and reuses some classes that  
> use temporary or real tables to store record selections. These are  
> mainly used as sets of records to display in lists for certain  
> [dynamically constructed] filter conditions. The display code will  
> then pick the record id and run a simple query to fetch the  
> displayed properties for that record. Since we can't cache the  
> information for all records in the displayed set (might be a really  
> large set initially), we came up with this approach to reduce the  
> size of the "cache" to the table containing just the record IDs.  
> However, in order to get a fast access, we have to rely on the order  
> of the record IDs in the table to match the order induced by the  
> "ORDER BY" expression...
>
> So, to cut a long story short: is this change in the returned order  
> for DISTINCT queries between version 3.5.1 and 3.5.9 a side effect  
> of switching the query engine? If so, is it an undesired side effect  
> or is it just something we have to live with?

As a follow-up:
One idea that occurred to me to work around this problem was to add an  
"ORDER BY rowid" to my "SELECT DISTINCT record_id ..." statement to  
enforce the "old" order, but apparently, one cannot order by rowid  
(although it is a pseudo-column that can be SELECTed)?

</jum>


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to