On Fri, 05 Jul 2013 12:38:37 +0200 Gabriel Corneanu <gabrielcorne...@gmail.com> wrote:
> About complexity: I'm not sure it's NlogN; for each N you need to > count N-1 columns, that's N^2 IMO. You're right if the data aren't sorted. If the data are sorted, to *find* the largest value smaller than a given value is O(log N), as you know. Counting can be minimized by an algorithm that avoids recounting. Consider the sequence A,B,C,D,E. We want to rank them, obtaining: A 0 B 1 C 2 D 3 E 4 We first count all how many are smaller than A, finding 0. We keep that number, because it will be useful later. To count those smaller than B, we first find the largest value smaller than B, i.e. A, and look up its count (0), and add 1 to that, producing 1, which we keep. Then we find B for C and add 1 to B's count, producing 2 for C, and so on. For every element N we have a lookup (log N) and a constant increment: O(N log N). > And you have an EXTRA temporary B-TREE? Doesn't it matter?? > Although I don't really understand why, it has an index on it. Sure it matters! It's not intrinsic to the operation. It represents an opportunity to improve the optimizer. > My original concern is indeed simplicity and efficiency. > But I'm NOT advocating any complex RANK function, but rather to > expose (as pseudo-data) the row index IN THE CURRENT RESULT SET; > while usually the result set is ordered, it doesn't really matter. > This information HAS NO other technical meaning (like rowid). Yes, and my answer to you is two-fold: 1. If you're only interested in the number of the row as delivered to the application, you can do that better in your application with i++ than SQLite can. In that case, both application and library have exactly the same information, and the computation is trivial. (Others on this thread say their favorite container library doesn't include a "row index" column. But that library is mated to SQLite somewhere, and that feature could be added there, perhaps in the callback function to sqlite3_exec().) 2. AIUI you suggested a SQL function that would produce the "row index". Such a function could be referenced in e.g. a WHERE clause, a very bad thing, whether or not is has "other technical meaning". In the relational model, there are no pointers and no pseudo-data. Everything is referenced by value. It was an intentional choice made for *convenience*: Codd recognized that no index has any meaning without an associated value. Eliminate the index and the value still stands, leaving the user with one less thing to think about. OTOH if removing the index does remove information, then the index is *data* and can be be represented in the database as data i.e., as a value. Something about your "row index" relates back to the data. The relational model has no way to represent data except as *values* Introducing a "row index" function into SQLite would provide complexity without power: another way to address the data, but no new way to express or derive anything about them. You can't have it both ways. Either the "row index" has no relationship to the data and therefore no reason to exist in the database, or it expresses something implied about the order and therefore is a value. Any example you give will fall into one of those two categories. --jkl _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users