Hello all, How can I speed up the sorting of a union all query?
Here are two queries which effectively create the same output. However the first one pipes the output from the union all to the shell's sort command whereas the second one used an order by clause. In addition, the second query explodes in the amount of time it takes for large data sets, becoming unusable after about 500,000 records. In contrast, the first is still reasonable even past several million records. $ time -p { grep -v ^# <<eof select amp1, tag, rowid from foo union all select amp2, tag, rowid from foo ; eof } | sqlite3 primers.db | sort -k1,1n -k2,2n | head -2 1|-1147116725663304346|67713 1|-1253798817002555285|42783 real 4.33 user 4.33 sys 0.13 $ time -p { grep -v ^# <<eof select amp1 as "as1", tag, rowid from foo union all select amp2 as "as1", tag, rowid from foo order by as1, tag ; eof } | sqlite3 primers.db | head -2 1|-9220010096774152842|117488 1|-8636788692624495913|191979 real 12.64 user 7.44 sys 2.16 $ sqlite3 primers.db 'select count(*) from foo ; ' 200000 The output for 500,000 records: # first query 1|-1147116725663304346|67713 1|-1253798817002555285|42783 real 12.26 user 10.99 sys 0.66 # second query 1|-9220010096774152842|117488 1|-8636788692624495913|191979 real 62.82 user 20.22 sys 7.08 I am guessing that some kind of index should be built on the fields in table foo. But which? I've tried the queries with indexes on amp1, amp2, tag, (amp1+tag), (amp2+tag), (amp1+amp2+tag). None show a performance increase, which makes me believe that either I'm not building the correct index or that the bottleneck is not the index. How can I go about finding where the bottleneck is? Thanks in advance for any pointers. Regards, - Robert _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users