[sqlite] Performance issue

Richard Hipp Mon, 30 Mar 2015 06:51:38 -0400

On Mon, Mar 30, 2015 at 6:44 AM, Jeff Roux <jfroux06 at gmail.com> wrote:


> Simon,
>
> here is the list of the indexes that were already defined on the table:
> CREATE INDEX idxLanWan ON flows(ipLan, ipWan, portWan, portLan);
> CREATE INDEX idxProto ON flows(protocol);
> CREATE INDEX idxTos ON flows(tos);
> CREATE INDEX idxTsLanWan ON flows(timestamp, ipLan, ipWan, portWan,
> portLan);
> CREATE INDEX tsindex ON flows(timestamp);
>

Recommend that you drop tsindex.  idxTsLanWan will work just as well.


>
> I added the ones you gave me:
> CREATE INDEX ts1 ON flows (portLan, protocol, timestamp);
> CREATE INDEX ts2 ON flows (portLan, timestamp);
> CREATE INDEX ts3 ON flows (protocol, portLan, timestamp);
> CREATE INDEX ts4 ON flows (protocol, timestamp);
>

If you decide to keep ts3 and/or ts4, then recommend that you drop idxProto.


>
>  and it appears it now uses the ts4 index, but the time spent is still ~ 5s
>

How many rows satisfy the WHERE clause?

You might get some relief using a covering index:

CREATE INDEX ts5 ON
flows(protocol,timestamp,portLan,nbBytesDecompOut,nbBytesCompln);


>
>
> sqlite> EXPLAIN QUERY PLAN SELECT SUM(nbBytesDecompOut+nbBytesCompIn) as
> vol, portLan as item FROM  flows  WHERE timestamp>=1383770600 AND
> timestamp<=1384770600 AND protocol IN (17, 6) GROUP BY portLan ORDER BY vol
> DESC LIMIT 6;
> 0|0|0|SEARCH TABLE flows USING INDEX ts4 (protocol=? AND timestamp>? AND
> timestamp<?) (~31250 rows)
> 0|0|0|EXECUTE LIST SUBQUERY 1
> 0|0|0|USE TEMP B-TREE FOR GROUP BY
> 0|0|0|USE TEMP B-TREE FOR ORDER BY
>
>
>
> 2015-03-30 12:20 GMT+02:00 Simon Slavin <slavins at bigfraud.org>:
>
> >
> > On 30 Mar 2015, at 10:46am, Jeff Roux <jfroux06 at gmail.com> wrote:
> >
> > > This daemon is running on a HP server with 12 cores, 32 GB of RAM,
> > > and a SSD drive. I have performance issues with some requests. For
> > > instance, the following request takes more than 5 seconds to
> > > accomplish with SQlite3 (in that particular case, the WHERE clause
> > > selects all the data in the database, i.e. 1000000 rows):
> > >
> > > SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item
> > > FROM  flows
> > > WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN
> > (17, 6)
> > > GROUP BY portLan ORDER BY vol DESC LIMIT 6;
> >
> > What indexes do you have on that table ?  I would expect to get best
> > results from one of the following indexes
> >
> > (portLan, protocol, timestamp)
> > (portLan, timestamp)
> > (protocol, portLan, timestamp)
> > (protocol, timestamp)
> >
> > For best results, CREATE all of those indexes and any others that strike
> > your fancy, run ANALYZE, then execute your SELECT.  If this gives
> > satisfactory results, use
> >
> > EXPLAIN QUERY PLAN [your SELECT statement]
> >
> > and see which index SQLite decided to use.  You can then delete the other
> > unused indexes and run ANALYZE one final time for a little more
> > optimization.
> >
> > It's worth noting that SQLite, since it has no server, has no
> server-level
> > caching.  If you're doing repeated queries using mySQL, after the first
> one
> > most of the index is cached so the times you're seeing are faster than
> > SQLite can ever produce.  Two disadvantages of mySQL are the memory usage
> > and the work involved in administration.
> >
> > Simon.
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
drh at sqlite.org

[sqlite] Performance issue

Reply via email to