Re: [sqlite] expected performance of selecting large numbers of records

2015-01-31 Thread Richard Hipp
On 1/31/15, Godmar Back  wrote:
> Thank you very much for those suggestions!
>
> Select max(rowid) completes immediately; I haven't tried WAL mode or the
> sorting idea yet.
>
> https://www.sqlite.org/wal.html appears to be down right now, but
> http://www.sqlite.org/wal.html is up.
>

There was some server maintenance.  And for whatever reason, the HTTPS
side didn't come back up after reboot.  I've manually restarted it so
it should be working now.

For future reference, if www.sqlite.org is ever down, you can always
visit one of the (geographically dispersed) backup sites:

http://www2.sqlite.org/
http://www3.sqlite.org/

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] expected performance of selecting large numbers of records

2015-01-31 Thread Godmar Back
Thank you very much for those suggestions!

Select max(rowid) completes immediately; I haven't tried WAL mode or the
sorting idea yet.

https://www.sqlite.org/wal.html appears to be down right now, but
http://www.sqlite.org/wal.html is up.

On Sat, Jan 31, 2015 at 5:31 AM, Igor Tandetnik  wrote:

>
> I'm not sure what you mean by "dump an approximate snapshot".


I meant that I don't care whether records that were recently added, say
during the last few seconds, are included in the total count()/dump of all
records or not. So I'm fine with an approximate snapshot that may not have
a few of the latest updates.

 - Godmar


Re: [sqlite] expected performance of selecting large numbers of records

2015-01-31 Thread Igor Tandetnik

On 1/31/2015 12:21 AM, Godmar Back wrote:

If I want to do a dump of the table, as in "SELECT value FROM " I'll
quickly get "database is locked" errors. Googling revealed that those are
because a long-running select keeps a cursor, and thus a read lock on the
entire database. I have since rewritten the query using multiple SELECT *
FROM ... LIMIT a, b queries, where b = 50 and a = 0, 50, 100, 150, ....
However, it takes 20 hours to fully extract the table's 850,000 records,
with only minimal per-record processing.


That's a quadratic algorithm you have there. In order to start reading
from the b-th record, SQLite needs to read and discard the previous b
records. You would be better off with a "WHERE id > $LastID ORDER BY id"
clause, where $LastID is the key of the last record in the previous
chunk.
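A minimal sketch of that keyset-pagination approach using Python's stdlib
sqlite3 binding (the poster mentioned using the Python binding). The table
and column names (mytable, id, value) are assumptions for illustration,
not the poster's actual schema:

```python
import sqlite3

# Hypothetical schema matching the 2-column table described in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(f"key{i:04d}", f"val{i}") for i in range(200)])

CHUNK = 50
last_id = ""        # every TEXT key sorts after the empty string
rows_seen = 0
while True:
    # Seek past the last key instead of using OFFSET, so each chunk
    # starts with an index lookup rather than a scan-and-discard.
    chunk = conn.execute(
        "SELECT id, value FROM mytable WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, CHUNK)).fetchall()
    if not chunk:
        break
    rows_seen += len(chunk)
    last_id = chunk[-1][0]
print(rows_seen)  # 200
```

Because each chunk resumes from the primary-key index, the whole dump is
linear in the table size instead of quadratic.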


Better still, switch to WAL mode ( https://www.sqlite.org/wal.html ). It
allows readers to work simultaneously with a single writer. Then just
run a single query across the whole table.
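Enabling WAL from the Python binding is a one-line pragma; a sketch
(WAL requires a real database file, so :memory: won't do):

```python
import os
import sqlite3
import tempfile

# Sketch only: a throwaway file database to demonstrate the mode switch.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# PRAGMA journal_mode=WAL returns the journal mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # "wal"
```

The mode is persistent: it is stored in the database file, so subsequent
connections from any process use WAL without re-issuing the pragma.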



My question: is this performance expected, or am I doing something wrong?
Is there a quick way to count and/or dump an approximate snapshot of a
single 2-column table such as the one shown above, even while keeping the
database available for concurrent writes?


Do you ever delete records from this table? If you don't, you can get a 
count with


select max(rowid) from mytable;

That should be nearly instantaneous. If you do delete rows, and all you
need is a quick count, then I would suggest adding ON INSERT and ON
DELETE triggers that update a count stored in a separate, singleton
table.
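The trigger-maintained counter could look like this sketch (table and
trigger names are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id TEXT PRIMARY KEY, value TEXT);
CREATE TABLE row_count (n INTEGER NOT NULL);   -- singleton counter row
INSERT INTO row_count VALUES (0);
CREATE TRIGGER mytable_ins AFTER INSERT ON mytable
  BEGIN UPDATE row_count SET n = n + 1; END;
CREATE TRIGGER mytable_del AFTER DELETE ON mytable
  BEGIN UPDATE row_count SET n = n - 1; END;
""")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(str(i), "v") for i in range(10)])
conn.execute("DELETE FROM mytable WHERE id = '3'")

# Reading the counter is a single-row fetch, independent of table size.
count = conn.execute("SELECT n FROM row_count").fetchone()[0]
print(count)  # 9
```

Reading one row from the singleton table is O(1), whereas SELECT COUNT(*)
must walk the whole table (or index).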


I'm not sure what you mean by "dump an approximate snapshot".
--
Igor Tandetnik



[sqlite] expected performance of selecting large numbers of records

2015-01-30 Thread Godmar Back
I have a single SQLite 2-column table with a TEXT primary key and a value,
like so: (id TEXT PRIMARY KEY, value TEXT). One process adds new records
periodically, perhaps 1-10 per minute. The database currently has 850,000
entries and is 7.3 GB on disk.

I also need to perform bulk queries, which appear to be very slow. On an
average desktop PC, for instance, a "SELECT COUNT(*)" takes over 5 minutes.

If I want to do a dump of the table, as in "SELECT value FROM " I'll
quickly get "database is locked" errors. Googling revealed that those are
because a long-running select keeps a cursor, and thus a read lock on the
entire database. I have since rewritten the query using multiple SELECT *
FROM ... LIMIT a, b queries, where b = 50 and a = 0, 50, 100, 150, ....
However, it takes 20 hours to fully extract the table's 850,000 records,
with only minimal per-record processing.
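For reference, a hypothetical reconstruction of the chunked OFFSET read
described above (table and column names are assumed, not the actual
schema). Each chunk forces SQLite to scan and discard all rows before the
offset, which is what makes the full dump quadratic in the table size:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(f"k{i:03d}", "v") for i in range(120)])

CHUNK = 50
offset = 0
total = 0
while True:
    # LIMIT a, b pagination: SQLite must step over `offset` rows before
    # returning any, so later chunks get progressively slower.
    rows = conn.execute("SELECT value FROM mytable LIMIT ? OFFSET ?",
                        (CHUNK, offset)).fetchall()
    if not rows:
        break
    total += len(rows)
    offset += CHUNK
print(total)  # 120
```

Note also that issuing the chunks as separate statements, outside one
transaction, is what makes the result only an approximate snapshot.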

My question: is this performance expected, or am I doing something wrong?
Is there a quick way to count and/or dump an approximate snapshot of a
single 2-column table such as the one shown above, even while keeping the
database available for concurrent writes?

I'm using Ubuntu 12.10 with libsqlite3 3.7.9-2ubuntu1.1 via the Python
binding.

 - Godmar