[sqlite] Performance issue

Dominique Devienne Wed, 8 Apr 2015 13:57:56 +0200

On Wed, Apr 8, 2015 at 1:24 PM, R.Smith <rsmith at rsweb.co.za> wrote:

> On 2015-04-08 11:35 AM, Dominique Devienne wrote:
>
>> On Wed, Apr 8, 2015 at 11:16 AM, Jeff Roux <jfroux06 at gmail.com> wrote:
>>
>>> time echo 'SELECT ... FROM flows WHERE timestamp>=1383770600 AND \
>>>       timestamp<=1384770600 AND protocol IN (17, 6) GROUP BY portLan \
>>>       ORDER BY vol DESC LIMIT 6;' |
>>>       mysql testperf
>>>
>>

>  If you dropped the LIMIT 6 from both, are the results
>> significantly different?
>>
>

> No Dominique, it's not that -
>

Perhaps. But that still doesn't get to my point. With a LIMIT clause, in
such a GROUP BY ORDER BY returning a large result set, would SQLite:
1) sort the whole result-set and then keep only the first top-N rows?
2) or instead do a partial-sort of the first top-N rows only, as in
http://www.cplusplus.com/reference/algorithm/partial_sort/?

I'm interested in finding out for sure. Perhaps that's highjacking this
thread a bit, but in case of SQLite doing #1, and MySQL doing #2, it could
explain some of the difference. (although sorting a 1M array is so fast
nowadays, I doubt it.).

there must be another thing wrong with his setup. (He might not see my
> replies because he uses gmail).
>

Sure. I don't dispute that.

> In fact, the entire script, making the table, adding the index, populating
> it with a million rows (with computed values no less) and then doing the
> query plus posting the output - ALL of it takes less than 4 seconds
> together: (Try it yourself)
>

His rows are "fatter", since he mentioned 41 columns. Which might make it
go over some threshold(s) (page cache?) slowing things down once past it.

But indeed, sharing the DB (if not sensitive data) would be the way to go.
--DD

[sqlite] Performance issue

Reply via email to