On Wed, Mar 07, 2012 at 10:30:54PM +0000, Emmanuel MacCaull scratched on the wall: > Hi, > > I'm writing a custom aggregate function whose result depends on the sort > order of a subselect. It isn't clear whether this a good idea
Not a good idea. > and the documentation on group_concat(X) seems to suggest that the > order that rows are processed by an aggregate function is not > guaranteed ("The order of the concatenated elements is arbitrary." > -- http://www.sqlite.org/lang_aggfunc.html). That's true. > Given something like this: > > SELECT group_concat(name) > FROM (SELECT * FROM table ORDER BY name) > GROUP BY some_id; > > I understood the documentation of group_concat() as meaning that even > though there is an ORDER BY in the subselect, the order of the > concatenated names is not guaranteed. > > If that is the case, it seems logical that the restriction is only there > because the order in which the aggregate function processes each row is > arbitrary. Does anyone know if this is true or is it safe to assume that > the aggregate will access each row in order? Not so much "arbitrary" as undefined. If you run the same query on the same data using the same version of SQLite, I'm sure you'll get the same results. That said, you shouldn't depend on that. SQL tables themselves have no ordering... they are properly considered sets of rows (in the formal mathematical sense of "set"). Similarly, most working sets of data (relational variables) have no defined ordering. A *result* set may have an order, defined by an ORDER BY, but that order may not hold when the result of a sub-select is cast back into a data source (such as above). In short, the database engine is able to reorder anything it wants, at any time, except for the final ORDER BY. While a specific query may do what you expect (at least for this version of SQLite) things may change. Heck, just adding an index can alter the query plan and change orderings. SQLite even has a pragma to purposely re-order queries just to test for unintentional order dependencies. For example, most (but not all; and not all the time) database engines will implement a GROUP BY clause by sorting the rows according to the GROUP BY expressions. This puts all "like" rows next to each other, making it easy to collapse the groups into individual rows. There is no reason to assume that sorting process is stable, or that it will preserve the sub-select ordering in any way. In fact, I wouldn't be surprised to find out some query engines just ignores a "data source" sub-select's ORDER BY all together, since it doesn't make semantic sense. There are all kinds of additional optimizations that can open up. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users