Re: [sqlite] Foreign Key constraint problem while dropping tables inside transaction

2013-05-13 Thread Григорий Григоренко

Понедельник, 13 мая 2013, 17:03 +01:00 от Simon Davies 
<simon.james.dav...@gmail.com>:
>On 13 May 2013 16:52, Simon Slavin < slav...@bigfraud.org > wrote:
>>
>> On 13 May 2013, at 3:54pm, Григорий Григоренко < grigore...@mail.ru > wrote:
>>
>>> sample database is:
>>>
>>> PRAGMA FOREIGN_KEYS=1;
>>> CREATE TABLE cat(id INTEGER PRIMARY KEY, name);
>>> INSERT INTO cat VALUES (1, 'Alice');
>>> CREATE TABLE owner(pet INTEGER REFERENCES cat(id));
>>> INSERT INTO owner VALUES(1);
>>>
>>> This script fails to drop tables with  'foreign key constraint failed':
>>>
>>> SAVEPOINT edit;
>>> PRAGMA FOREIGN_KEYS=0;
>>> DROP TABLE cat;
>>> DROP TABLE owner;
>>> RELEASE edit;
>>> PRAGMA FOREIGN_KEYS=1;
>>
>> You are DROPping your tables in the wrong order.  The 'owner' table refers 
>> to the 'cat' table.  If you DROP 'cat' you would be left with an 'owner' 
>> table which refers to a table which doesn't exist.  Swap the order of the 
>> DROPs and your database will not risk that kind of disaster.
>>
>>> This script works OK:
>>>
>>> PRAGMA FOREIGN_KEYS=0;
>>> DROP TABLE cat;
>>> DROP TABLE owner;
>>> PRAGMA FOREIGN_KEYS=1;
>>
>> Your problem with the one with SAVEPOINT is that SQLite doesn't know when 
>> you're going to do the RELEASE.  If you left out the second DROP, then 
>> issued a RELEASE, the database would be corrupt, so SQLite issues the error 
>> message to warn you about it.  The version without the SAVEPOINT never has 
>> to worry about you doing that.
>>
>> Simon.
>
>http://www.sqlite.org/pragma.html#pragma_foreign_keys :
>
>"...foreign key constraint enforcement may only be enabled or disabled
>when there is no pending BEGIN or SAVEPOINT. "

Ups. My fault, must have read docs more carefully.
Thank you.

>
>Regards,
>Simon
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Foreign Key constraint problem while dropping tables inside transaction

2013-05-13 Thread Григорий Григоренко
 Hi,

sample database is:

PRAGMA FOREIGN_KEYS=1;
CREATE TABLE cat(id INTEGER PRIMARY KEY, name); 
INSERT INTO cat VALUES (1, 'Alice');
CREATE TABLE owner(pet INTEGER REFERENCES cat(id)); 
INSERT INTO owner VALUES(1);

This script fails to drop tables with  'foreign key constraint failed':

SAVEPOINT edit;
PRAGMA FOREIGN_KEYS=0;
DROP TABLE cat;
DROP TABLE owner;
RELEASE edit; 
PRAGMA FOREIGN_KEYS=1;

This script works OK:

PRAGMA FOREIGN_KEYS=0;
DROP TABLE cat;
DROP TABLE owner;
PRAGMA FOREIGN_KEYS=1;


Why?

--  Григорий Григоренко
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] sqlite_interrupt() does not cancel ATTACHing of blocked database

2013-04-29 Thread Григорий Григоренко
 Hi,

seems like sqlite_interrupt() does not cancel ATTACHing of blocked database. 

Steps to reproduce problem:
 1. open 'foo.db' in shell,  run 'BEGIN EXCLUSIVE';
 2. switch to custom application and open ":memory:" database in it; set busy 
timeout to 15 seconds and run "ATTACH 'foo.db' as db" in thread A;
 3. wait 2-3 seconds and call sqlite_interrupt() in 
thread B;
 4. nothing happens: application still waits 15 seconds and receives 
SQLITE_BUSY error.

SQLite should immediately stop waiting for database (upon sqlite_interrupt() 
call) and return with SQLITE_INTERRUPT error, perhaps?

Platform: Windows 7 x64, SQLite 3.7.16.2

-- 
Григорий Григоренко
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help. Read only process being blocked by writer process.

2012-11-28 Thread Григорий Григоренко
Hi,


CREATE TABLE rev(current); INSERT INTO rev VALUES(0);
CREATE TABLE data(..., revision);

Readers:
SELECT * FROM data JOIN rev ON revision = current WHERE ... ; 
// or "SELECT current FROM rev" into var and passing it value in "SELECT * FROM 
data WHERE revision=?"

Writer:

// insert new records, old records remains

SELECT current + 1 FROM rev; // into some variable

BEGIN;
INSERT INTO data(..., revision) VALUES (..., current );
INSERT INTO data(..., revision) VALUES (..., current );
...
... repeat 1000 times
...
COMMIT;

BEGIN;
INSERT INTO data(..., revision) VALUES (..., current );
INSERT INTO data(..., revision) VALUES (..., current );
...
... repeat 1000 times
...
COMMIT;

// repeat inserting until all records are inserted; 1000 is a number of records 
taken by a wild guess )

// now switch readers to new records
BEGIN; UPDATE rev SET current = current + 1; COMMIT; // no need for a 
begin/commit just to point that this runs inside its own transaction

// now delete old records again incrementally

// repeat this block until records stop deleting from table
BEGIN;
SELECT Max(rowid) - 1000 FROM data; // into variable MaxId
DELETE FROM data WHERE revision = current - 1 AND rowid > MaxId;
COMMIT;

// done, there are only new records in a table, repeat above steps to insert 
new bunch of records




Regads,
GG


Wed 28 Nov 2012 09:47:50 от Alejandro Martínez :
>   
>
>


>



>I have one process that each 30 minutes refills several tables in this
>
manner:
>

>
sqlite3_open_v2(CACHEDB_PATH, _conn, SQLITE_OPEN_CREATE |
>
SQLITE_OPEN_READWRITE, NULL)
>

>
- For each table:
>

>
begin deferred transaction; delete from [table];
>
insert into table ...
>
insert into table ...
>
insert into table ...
>
insert into table ...
>
[up to 180.000 rows]
>
commit;
>

>
and sometimes the commit fails, so it is retried. (why would it fail? its
>
the only  writter)
>

>
And then i have many other processes that open that sqlite database read
>
only. sqlite3_open_v2(_dbfile, _conn, SQLITE_OPEN_READONLY, NULL)
>
and sqlite3_busy_timeout(sqcache_conn, 5000)
>

>
These processes create very simple prepared statements to query that tables.
>

>
And the big problem i'm having, is that when i step these prepared
>
statements, they lock for 5 seconds and then fail.
>

>
And i put that busy timeout just for completeness, cause i wasn't expecting
>
any locking because for being a read only query.
>

>
I really need these queries not to lock or fail.
>

>
What am i doing wrong?
>
Any suggestions?
>

>
Thank you,
>
Alejandro
>
___
>
sqlite-users mailing list
>
>sqlite-users@sqlite.org
>
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>   





>

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SUGGESTION: "now" as alias for "strftime('%s','now')"

2012-11-04 Thread Григорий Григоренко



Fri, 2 Nov 2012 14:11:26 + от "Black, Michael (IS)" 
<michael.bla...@ngc.com>:
>CREATE TABLE t(id,time);
>
INSERT INTO t VALUES(1,CURRENT_DATE);
>
INSERT INTO t VALUES(2,CURRENT_TIMESTAMP);
>
INSERT INTO t VALUES(3,datetime('now'));
>
INSERT INTO t VALUES(4,date('now'));
>
SELECT * FROM t;
>
1|2012-11-02
>
2|2012-11-02 14:10:15
>
3|2012-11-02 14:10:15
>
4|2012-11-02
>
>
Perhaps the documentation needs to be better?  Apparently you couldn't find 
this info...
Indeed, I was never aware of CURRENT_*.

Anyway, all these functions return current moment as _string_ and this is not a 
great way to store datetime in db, isn't it?
- more memory occupied;
- slower compare;
- cannot add & substract;
etc.




>
>
Michael D. Black
>
Senior Scientist
>
Advanced Analytics Directorate
>
Advanced GEOINT Solutions Operating Unit
>
Northrop Grumman Information Systems
>
>

>
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Григорий Григоренко [grigore...@mail.ru]
>
Sent: Friday, November 02, 2012 8:08 AM
>
To: General Discussion of SQLite Database
>
Subject: EXT :Re: [sqlite]SUGGESTION: "now" as alias for "strftime('%s','now')"
>
>
Thu, 1 Nov 2012 19:57:42 + от Simon Slavin <slav...@bigfraud.org>:
>
>
>
>
On 1 Nov 2012, at 7:55pm, Григорий Григоренко <grigore...@mail.ru> wrote:
>
>
>
>
>
> it is a common practice to store datetime values as UNIX time UTC.
>
>
>
> Maybe, Sqlite should have some shortcut for evaluating current moment?
>
>
>
>
>
Please read
>
>
>
>
>
<http://www.sqlite.org/lang_datefunc.html>
>
>
>
Surely, I did) Don't get me wrong - my point is not that Sqlite is lacking 
functions that modify or format date values.
>
>
It's about having useful shortcut for getting current moment that doesn't have 
(string) parameters and so can be easily remembered and typed.
>
>
Compare:
>
MS SQL: CURRENT_TIMESTAMP
>
PostgreSQL: now()
>
Oracle: sysdate
>
>
To:
>
Sqlite: strftime('%s','now')
>
>
>
>Simon.
>
___
>
sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SUGGESTION: "now" as alias for "strftime('%s','now')"

2012-11-04 Thread Григорий Григоренко


Fri, 2 Nov 2012 10:25:18 -0400 от Richard Hipp :

>
>
>On Fri, Nov 2, 2012 at 10:18 AM, Simon Davies  
>wrote:
>

datetime() will give current date and time
>>
likewise date(), time() etc
>>
>If I read the original post correctly, I think the OP is requesting a new 
>function, "now()", that returns the number of seconds since 1970.  This would 
>be the same as "CAST(strftime('%s','now') AS INTEGER)", just easier to 
>remember.
Correct!


>
>By coincidence, such a function has already been added to the SQLite 
>implementation inside of Fossil.  See:
>
>http://www.fossil-scm.org/fossil/artifact/b743628236c?ln=669-679
>http://www.fossil-scm.org/fossil/artifact/b743628236c?ln=718
>
>Perhaps the OP can simply copy the code above into his own application?
>
I'm using Delphi with sqlite3.dll, maybe I should introduce my own function.






>
>-- 
>D. Richard Hipp
>d...@sqlite.org
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SUGGESTION: "now" as alias for "strftime('%s','now')"

2012-11-04 Thread Григорий Григоренко



Fri, 02 Nov 2012 15:32:44 +0100 от Clemens Ladisch :
>Igor Tandetnik wrote:
>
> SQLite does in fact accept CURRENT_TIMESTAMP in DEFAULT clause.
>
>
>
> http://sqlite.org/lang_createtable.html#tablecoldef
>
>
SQLite does in fact accept CURRENT_TIMESTAMP anywhere:
>http://www.sqlite.org/lang_expr.html
>
>
Yes, indeed. Perhaps, this should be 
on http://www.sqlite.org/lang_datefunc.html page.





>
Regards,
>
Clemens
>
___
>
sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SUGGESTION: "now" as alias for "strftime('%s','now')"

2012-11-02 Thread Григорий Григоренко



Thu, 1 Nov 2012 19:57:42 + от Simon Slavin <slav...@bigfraud.org>:

>
On 1 Nov 2012, at 7:55pm, Григорий Григоренко <grigore...@mail.ru> wrote:
>
>
> it is a common practice to store datetime values as UNIX time UTC. 
>
> Maybe, Sqlite should have some shortcut for evaluating current moment?
>
>
Please read
>
>
<http://www.sqlite.org/lang_datefunc.html>
>
Surely, I did) Don't get me wrong - my point is not that Sqlite is lacking 
functions that modify or format date values.  

It's about having useful shortcut for getting current moment that doesn't have 
(string) parameters and so can be easily remembered and typed.

Compare:
MS SQL: CURRENT_TIMESTAMP
PostgreSQL: now()
Oracle: sysdate

To:
Sqlite: strftime('%s','now')


>Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SUGGESTION: "now" as alias for "strftime('%s','now')"

2012-11-01 Thread Григорий Григоренко
Hello,

it is a common practice to store datetime values as UNIX time UTC. 
Maybe, Sqlite should have some shortcut for evaluating current moment?
Some alias for strftime('%s','now') ? Like, "now" or "unixnow":

  created_at   DEFAULT  (now())     vs     created_at    DEFAULT  
(strftime('%s','now'))

It is shorter and more important - it can be written without consulting docs 
for strfime() , I cannot remember parameters  (


Regards,
Grigory Grigorenko 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A question about prepared statements

2012-10-21 Thread Григорий Григоренко



Fri, 19 Oct 2012 13:16:31 -0700 от Pavel Ivanov <paiva...@gmail.com>:
>>> Var Alice := Db.SelectRecord('SELECT * FROM client WHERE client_name = ?', 
>>> [  'alice' ]);
>
>>
>
>> 1. If I drop record for "alice" from db and then access column data in 
>> prepared stmt will it work OK?
>
>
>
> If the select statement wasn't reset or finalized, you won't be able to 
> delete a record from the database (or, in WAL journaling mode, you would be 
> able to, but the reader would still see original data).
>
>
Well, if he uses the same connection used to prepare the statement,
>
then he will be able to delete record even in normal journaling mode.
>
What happens in this case is undefined. It might be that column data
>
will be still accessbile, it might be you get some garbage, it might
>
be an access violation.
Using same connection means using same transaction, right?

Let's say app does the following (via the same connection):
1. BEGIN TRANSACTION;
2. prepare and step stmt A (SELECT * FROM foo WHERE rowid=1), A is not 
finalized or reset;
3. execute stmt B (DELETE FROM foo WHERE rowid=1), B is prepared, executed and 
finalized;
4. access column data via stmt A
5. COMMIT

A record with rowid=1 is actually deleted from db on step 5, right?

Does accessing record data via stmt A (step 4)  "break any rules"? 



>
>
>> 2. Can I drop client table while having such prepared stmt not closed?
>
>
I think table client cannot be dropped altogether until all statements
>
using it are reset/finalized.
>
>
>
Pavel
>
>
>
On Fri, Oct 19, 2012 at 12:49 PM, Igor Tandetnik <itandet...@mvps.org> wrote:
>
> Григорий Григоренко <grigore...@mail.ru> wrote:
>
>> I am using prepared stmts to cache selected record for later use. I have a 
>> wrapper class, that has methods like AsString(name),
>
>> AsFloat(name) etc to extract data for a column name.
>
>>
>
>> I prepare, bind and then do Sqlite3_step() to get record. Later I use 
>> sqlite3_column_XXX() to access data.
>
>>
>
>> My concern is - do open selective prepared stmts depend on something in db 
>> or lock something in db?
>
>
>
> Yes, until you call sqlite3_reset or sqlite3_finalize. For as long as the 
> statement is "active" (meaning, sqlite3_reset or sqlite3_finalize has not 
> been called after the most recent sqlite3_step call), it holds a read 
> transaction open.
>
>
>
>> Var Alice := Db.SelectRecord('SELECT * FROM client WHERE client_name = ?', [ 
>>  'alice' ]);
>
>>
>
>> 1. If I drop record for "alice" from db and then access column data in 
>> prepared stmt will it work OK?
>
>
>
> If the select statement wasn't reset or finalized, you won't be able to 
> delete a record from the database (or, in WAL journaling mode, you would be 
> able to, but the reader would still see original data).
>
>
>
>> 2. Can I drop client table while having such prepared stmt not closed?
>
>
>
> What do you mean by "closed" here?
>
>
>
>> 3. If I have a lot of such stmts (100 or more) will it somehow affect 
>> performance of SQLite or waste lots of memory?
>
>
>
> No, not really. There is a reasonably small data structure associated with a 
> prepared statement, on the order of a few hundred bytes perhaps.
>
>
>
>> Are there some kind of cursors for each selecting prepared stmt?
>
>
>
> In a sense. After a call to sqlite3_step and before a call of sqlite3_reset 
> or sqlite3_finalize, you might think of a statement handle as a form of a 
> cursor. Each subsequent sqlite3_step call advances this cursor forward by one 
> row.
>
>
>
>> 4. If there is a cursor, maybe there is a way to disconnect a cursor from 
>> stmt keeping the single record still available?
>
>
>
> No, short of making a copy of every column's value.
>
> --
>
> Igor Tandetnik
>
>
>
> ___
>
> sqlite-users mailing list
>
> sqlite-users@sqlite.org
>
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
>
sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A question about prepared statements

2012-10-21 Thread Григорий Григоренко



Fri, 19 Oct 2012 15:49:07 -0400 от "Igor Tandetnik" <itandet...@mvps.org>:
>Григорий Григоренко <grigore...@mail.ru> wrote:
>
> I am using prepared stmts to cache selected record for later use. I have a 
> wrapper class, that has methods like AsString(name),
>
> AsFloat(name) etc to extract data for a column name. 
>
> 
>
> I prepare, bind and then do Sqlite3_step() to get record. Later I use 
> sqlite3_column_XXX() to access data.
>
> 
>
> My concern is - do open selective prepared stmts depend on something in db or 
> lock something in db?
>
>
Yes, until you call sqlite3_reset or sqlite3_finalize. For as long as the 
statement is "active" (meaning, sqlite3_reset or sqlite3_finalize has not been 
called after the most recent sqlite3_step call), it holds a read transaction 
open.
OK.



>
>
> Var Alice := Db.SelectRecord('SELECT * FROM client WHERE client_name = ?', [  
> 'alice' ]);
>
> 
>
> 1. If I drop record for "alice" from db and then access column data in 
> prepared stmt will it work OK?
>
>
If the select statement wasn't reset or finalized, you won't be able to delete 
a record from the database (or, in WAL journaling mode, you would be able to, 
but the reader would still see original data).
What kind of error is returned in this case?

Actually, I should (and would) find out this via simple test app )


>
>
> 2. Can I drop client table while having such prepared stmt not closed?
>
>
What do you mean by "closed" here?
Finalized.



>
>
> 3. If I have a lot of such stmts (100 or more) will it somehow affect 
> performance of SQLite or waste lots of memory?
>
>
No, not really. There is a reasonably small data structure associated with a 
prepared statement, on the order of a few hundred bytes perhaps.
>
>
> Are there some kind of cursors for each selecting prepared stmt? 
>
>
In a sense. After a call to sqlite3_step and before a call of sqlite3_reset or 
sqlite3_finalize, you might think of a statement handle as a form of a cursor. 
Each subsequent sqlite3_step call advances this cursor forward by one row.
>
>
> 4. If there is a cursor, maybe there is a way to disconnect a cursor from 
> stmt keeping the single record still available?
>
>
No, short of making a copy of every column's value.
OK.


>
-- 
>
Igor Tandetnik
>
>
___
>
sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] A question about prepared statements

2012-10-19 Thread Григорий Григоренко
Hello,

I am using prepared stmts to cache selected record for later use. I have a 
wrapper class, that has methods like AsString(name), AsFloat(name) etc to 
extract data for a column name. 

I prepare, bind and then do Sqlite3_step() to get record. Later I use 
sqlite3_column_XXX() to access data. 

My concern is - do open selective prepared stmts depend on something in db or 
lock something in db? 


Var Alice := Db.SelectRecord('SELECT * FROM client WHERE client_name = ?', [  
'alice' ]);

1. If I drop record for "alice" from db and then access column data in prepared 
stmt will it work OK?

2. Can I drop client table while having such prepared stmt not closed? Will I 
be able to access data even if table was dropped?

3. If I have a lot of such stmts (100 or more) will it somehow affect 
performance of SQLite or waste lots of memory? Are there some kind of cursors 
for each selecting prepared stmt? 

4. If there is a cursor, maybe there is a way to disconnect a cursor from stmt 
keeping the single record still available? Its always about a single record. Is 
there anything to ease the burden of stopped selective stmt but not closing it?

5. Other disadvantages I might have missed?


A countrary to this is getting all the data from record to some user class and 
closing stmt. i dont like overhead of it. 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-28 Thread Григорий Григоренко
So, should we file this as a defect or bug somehow?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> >
> > EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and id > 290
> > LIMIT 10;
> >
> > SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 10;
> >
> >
> > Run query. Now using idx2 index SQLITE reads only about 20 Kbytes!
> >
> >
> >
> Grigory, it seems you just added a field and copied rowid column to it, so
> your fast query is equivalent to queries with general fields that works as
> expected (see my post about a,b,c fields). Id that is mapped to rowid is
> still has special treatment and not used effectively. This still has to be
> explained. But if your requirements to your base allows you to add extra
> field, I think you can use it, just support autoincrement feature some other
> way.
> 


Well, as I wrote it's workaround. One can use it to avoid massive reads until 
SQLITE behaviour is fixed.

I absolutely agree with you that this behaviour of SQLITE should be explained. 

Or considered as defect and fixed.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
It's all about caching. 

If OS has already cached index data query executes in less than a second time.

To understand what is actually happening you should monitor reading count of 
SQLITE console process (or your app that is executing).

There shouldn't be reading of more than 1 Mb if SQLITE is using index properly.


26 июля 2011, 20:27 от "Black, Michael (IS)" <michael.bla...@ngc.com>:
> If I run your sql.txt script with the normalization of kind my first time 
> query shows 0 seconds.
> 
> D:\x>sqlite3 sq1  0|0|0|SEARCH TABLE log USING INDEX idxlog_kind_computer (kind=? AND 
> computer=?) (~406234 rows)
> Seconds elapsed: 0
> 
> 
> 
> Does yours show a longer time than that and/or a different plan?
> 
> 
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
> 
> 
> ____
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Григорий Григоренко [grigore...@mail.ru]
> Sent: Tuesday, July 26, 2011 8:16 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
> 
> >
> > select count(*) as cnt,kind,computer
> > from log
> > group by kind,computer
> > order by cnt desc
> >
> > what happens here?
> >
> > SELECT *
> >  FROM log INDEXED BY idxlog_kind_computer
> >  WHERE kind = 'info' AND computer=1 and id > 7070636
> >  LIMIT 100;
> >
> > there are 3_022_148 identical entries 'info,1' in your index
> >
> > sqlite has to traverse near all of them and so it is not much help with
> > binary search.
> > Drop this index, run ANALYZE and sqlite will use your primary key quite
> > fast.
> 
> 
> The problem is: choosing primary key index is not always a good choice.
> Sometimes it can lead to scanning half of table records.
> 
> 
> Let's assume we have:
> 
> CREATE TABLE foo(bar);
> CREATE INDEX idx ON foo(bar);
> 
> INSERT INTO foo VALUES('one');
> .. 1 mln inserts on one..
> INSERT INTO foo VALUES('one');
> 
> INSERT INTO foo VALUES('two');
> .. 1 mln inserts of two..
> INSERT INTO foo VALUES('two');
> 
> INSERT INTO foo VALUES('three');
> .. 1 mln inserts of three..
> INSERT INTO foo VALUES('three');
> 
> Now, the query:
> 
> SELECT * FROM foo WHERE bar='three' and id > 1 LIMIT 10;
> 
> If query planner uses primary index it'll have to scan 2 mln records before 
> he hits first one matching query.
> 
> (speaking about "normal" settings, SQLITE scans from lesser rowid to greater)
> 
> And if query planner can somehow use index idx with both supported values { 
> bar:'three', id:1 } there will be no scanning.
> 
> It's binary search and it's what I need in my app.
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
I found some kind of workaround to solve this problem.

Create new database and run:


CREATE TABLE foo(bar);

INSERT INTO foo VALUES(null);  INSERT INTO foo VALUES(null);  INSERT INTO foo 
VALUES(null); 

BEGIN;

INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo;

COMMIT;

DELETE FROM foo WHERE rowid > 300;

UPDATE foo SET bar = 'one' WHERE rowid <= 100;
UPDATE foo SET bar = 'two' WHERE rowid > 100 AND rowid < 200;
UPDATE foo SET bar = 'three' WHERE rowid > 200;

CREATE INDEX idx1 ON foo(bar);


This creates foo table with 3 mln records: 1 mln with bar = 'one', 1 mln with 
bar = 'two' and 1 mln with bar = 'three'.


Now, the query.

EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and rowid > 290 
LIMIT 10;

SELECT * FROM foo WHERE bar = 'three' and rowid > 290 LIMIT 10;


While running this query SQLITE  reads 18 Mbytes (and it uses idx1 index).

18 MBYTES.


Now, run this:

ALTER TABLE foo ADD COLUMN id; UPDATE foo SET id = rowid; CREATE INDEX idx2 ON 
foo(bar, id);


And update query: change rowid to id:

EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 
10;

SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 10;


Run query. Now using idx2 index SQLITE reads only about 20 Kbytes!


20 KBYTES.




This script proves that using rowid in index is possible and eliminates the 
need of reading lots of data.

But SQLITE is not using it.  I think this behaviour is SQLITE defect. 










___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> select count(*) as cnt,kind,computer
> from log
> group by kind,computer
> order by cnt desc
> 
> what happens here?
> 
> SELECT *
>  FROM log INDEXED BY idxlog_kind_computer
>  WHERE kind = 'info' AND computer=1 and id > 7070636
>  LIMIT 100;
> 
> there are 3_022_148 identical entries 'info,1' in your index
> 
> sqlite has to traverse near all of them and so it is not much help with
> binary search.
> Drop this index, run ANALYZE and sqlite will use your primary key quite
> fast.


The problem is: choosing primary key index is not always a good choice. 
Sometimes it can lead to scanning half of table records.


Let's assume we have:

CREATE TABLE foo(bar);
CREATE INDEX idx ON foo(bar);

INSERT INTO foo VALUES('one');
.. 1 mln inserts on one..
INSERT INTO foo VALUES('one');

INSERT INTO foo VALUES('two');
.. 1 mln inserts of two..
INSERT INTO foo VALUES('two');

INSERT INTO foo VALUES('three');
.. 1 mln inserts of three..
INSERT INTO foo VALUES('three');

Now, the query:

SELECT * FROM foo WHERE bar='three' and id > 1 LIMIT 10;

If query planner uses primary index it'll have to scan 2 mln records before he 
hits first one matching query.

(speaking about "normal" settings, SQLITE scans from lesser rowid to greater)

And if query planner can somehow use index idx with both supported values { 
bar:'three', id:1 } there will be no scanning.

It's binary search and it's what I need in my app.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко


26 июля 2011, 16:42 от res...@googlemail.com:
> Think about the distribution of your Data.
> 
> select count(*) as cnt,kind,computer
> from log
> group by kind,computer
> order by cnt desc
> 
> what happens here?
> 
> SELECT *
>  FROM log INDEXED BY idxlog_kind_computer
>  WHERE kind = 'info' AND computer=1 and id > 7070636
>  LIMIT 100;
> 
> there are 3_022_148 identical entries 'info,1' in your index


If there wasn't id condition in WHERE I would agree. 

But in my opinion this is not that simple.

Each index record includes ROWID. At the end. 

http://www.sqlite.org/queryplanner.html

"An index is another table similar to the original "fruitsforsale" table but 
with the content (the fruit column in this case) stored in front of the rowid 
and with all rows in content order."

http://www.sqlite.org/images/qp/idx1.gif

"The "fruit" column is the primary key used to order the elements of the table 
and the "rowid" is the secondary key used to break the tie when two or more 
rows have the same "fruit". In the example, the rowid has to be used as a 
tie-breaker for the "Orange" rows. Notice that since the rowid is always unique 
over all elements of the original table, the composite key of "fruit" followed 
by "rowid" will be unique over all elements of the index."

And this makes each index record different.

WHERE condition is providing query planner with 3 nice values to match exact 
index record.

And SQLITE ignores this knowledge and uses only first 2 values to search index. 

Why? 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко


> Could you post your timings and read stats again?  Are you happy with what 
> you're seeing now?
> 

Actually, there was ~ 50% speed-up. More or less. The idea of normalizing 
worked great, thank you.

I'm concerned about SQLITE indexes. 


> 
> 
> I think when you drop the index and recreate it still scatters.  But re-doing 
> your data with the new kind will tell for sure.  I take it you have to build 
> the index while running and can't wait until it's done.


To me problem is narrowed to the fact that SQLITE is ignoring ROWID value while 
scanning index.

Using "INDEX ON Foo(Bar)"  and condtion like "Bar = XXX and Rowid > YYY" SQLITE 
reads too much.

Looks like in this case SQLITE reads all of index records matching condition 
Bar = XXX and scan them.

And if Bar has only a few allowed values (= bad selectivity ) this takes lot's 
of time and disk i/o cause there are lots of records with the same Bar value.

Looks like SQLITE is not using knowledge that index records are sorted by Bar 
and then by Rowid.


This is something I don't (and didn't) expect...




> 
> 
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> > I need query to be ORDER BY id DESC. I've dropped this ORDER BY to simplify 
> > my case. But in real-life app I need it.
> > 
> > So, index on (kind, computer) has these index records:
> > 
> > [ KIND ] [ COMPUTER ] [ ID ]
> 
> I don't know that SQLite does an inherent addition of the 'id' column to all 
> INDEXes.  Some index implementations do this because their algorithm requires 
> that all positions in an index are unique, but I don't think SQLite works 
> like that.  I think that if you want 'id' to be part of an index you have to 
> say that in the definition.

SQLITE always stores rowid in every index record. This rowid let SQLITE find 
approriate database record. You can read more here: 
http://www.sqlite.org/queryplanner.html 

> 
> Nevertheless this still doesn't explain why your first run of a query is so 
> much slower than subsequent runs.  SQLite's cache size is set here:
> 
> http://www.sqlite.org/pragma.html#pragma_cache_size
> 
> If the cache size described there doesn't explain the behaviour you're 
> seeing, the problem isn't with SQLite, it's in your application or the OS or 
> your hardware.


The problem is filling the cache. SQLITE wants too much data to execute simple 
query.

Cache size of SQLITE doesn't matter because OS does it's own caching. After 
query has been run for a first time OS caches disk pages.

And subsequent queries immediately gets data from RAM not from disk.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
Did that. Timings has decreased.  As I understand it it's about decreasing 
index size (that includes kind column).



To me the problem is still there. If my database will have 10 mln log records 
first running query will stuck again :(



I don't understand SQLITE strategy. Let me explain.

Index is an array of index records. They fill pages in database. 

Searching with B-Tree index is similar to binary search in ordered array, isn't 
it? You pick record in a middle of array subset and compare to conditional 
value.

This step let you drop half of index subset from search.

Let's say size of index is 100 Mb and it contains 4 mln index records. 

This is 100 Mb / 8 Kb (size of page) ~ 12000 pages.

While doing binary search for 4 mln records we do 22 compares at most (2^22 ~ 4 
mln).

Assume worst case - all of these comparings use different pages. 

So, we need to read 22 pages = 180 Kb.

Surely there's additional data to be read for index. Like some intermediate 
nodes in B-Tree.

Let's triple the number of pages, 66 pages = 540 Kb.


But SQLITE reads in this case ~ 50 Mb!! 


This leads us to conclusion: index in SQLITE database if scattered and cannot 
be "jumped directly" to N-th element. SQLITE has to read it somehow 
consecutively. 

And so SQLITE has to read half of index (!) to find matching index record.


Am I getting it right?






25 июля 2011, 19:35 от "Black, Michael (IS)" <michael.bla...@ngc.com>:
> You need to normalize your "kind" value.
> 
> 
> 
> .pragma cache_size=15000;
> 
> drop index idxlog_kind_computer;
> 
> create table kind(id integer,kind text);
> insert into kind values(1,'debug');
> insert into kind values(2,'error');
> insert into kind values(3,'info');
> insert into kind values(4,'timing');
> insert into kind values(5,'warn');
> update log set kind=1 where kind='debug';
> update log set kind=2 where kind='error';
> update log set kind=3 where kind='info';
> update log set kind=4 where kind='timing';
> update log set kind=5 where kind='warn';
> create index idxlog_kind_computer ON log(kind,computer);
> 
> 
> 
> Then see how long your first query takes.
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
> 
> 
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Григорий Григоренко [grigore...@mail.ru]
> Sent: Monday, July 25, 2011 8:45 AM
> To: sqlite-users@sqlite.org
> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
> 
> I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
> much :)
> 
> 
> I've calculated size of index (it is index on log (kind,computer) ) of its 
> own: dropped index, run VACUUM and re-created index.
> 
> Database file increased by 105 Mb (and sqlite3 process counter shows that 
> there were ~105 Mb written to disk).
> 
> 
> This means that index on log(kind, computer) takes 105 Mb of database file 
> (and whole size of database is 1259 Mb).
> 
> 
> Now, I'm running query which is using this index (and is not returning any 
> data) and monitor that sqlite3 process reads ~50 Mb.
> 
> 
> So there are  two major problems here.
> 
> 1) SQLITE has to read about _half of index_ before it can use it (and 
> understand there are no records matching query).
> 
> If cache is enough to hold 50 Mb then on subsuquent queries sqlite process is 
> not reading at all.
> 
> 2) SQLITE is reading abnormally slowly during this first-time running query  
> (waiting for something a lot?).
> 
> During index creation I monitored sqlite3 process and it was consuming CPU at 
> ~20% rate and it's doing I/O at ~10 Mb per second rate.
> That's what I call "normal load"!
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> >
> > 1) SQLITE has to read about _half of index_ before it can use it (and 
> > understand there are no records matching query).
> >
> > If cache is enough to hold 50 Mb then on subsuquent queries sqlite process 
> > is not reading at all.
> >
> 
> Please, post your query. To understand whether sqlite reads too much
> or not it's better to know what exactly you want to select.
> 


This is script I ran:

CREATE TABLE IF NOT EXISTS T(t); DELETE FROM T; INSERT INTO T VALUES( 
strftime('%s', 'now' ) );

EXPLAIN QUERY PLAN SELECT * FROM log  INDEXED BY idxlog_kind_computer WHERE 
kind = 'info' AND computer=1 and id > 7070636 LIMIT 100;

SELECT * FROM log INDEXED BY idxlog_kind_computer WHERE kind = 'info' AND 
computer=1 and id > 7070636 LIMIT 100;

SELECT "Seconds elapsed: " || (strftime('%s', 'now' ) - t) FROM T ;



> 
> > 2) SQLITE is reading abnormally slowly during this first-time running query 
> >  (waiting for something a lot?).
> >
> 
> Is this with the recreated index or still the one that was created
> during the lifetime of your program?
> 


It doesn't matter. Tried it with old and with recreated index, same behaviour.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> I thought that, if you have an index on a,b,c,d
> than you should not have an index on a,b,c too
> because if you use those 3 field in the where-clause, use can be made of
> the 4-field index
> 

I'm not sure. Let me explain.

I need query to be ORDER BY id DESC. I've dropped this ORDER BY to simplify my 
case. But in real-life app I need it.

So, index on (kind, computer) has these index records:

[ KIND ] [ COMPUTER ] [ ID ]
...

They are ordered by kind, computer and then by id. So ORDER BY id DESC comes 
free.
Query planner just need to retrieve records from subset of index records 
starting at last one backwards.

Index on (kind,computer, process,who) has these index records:

[ KIND ] [ COMPUTER ] [ PROCESS ] [ WHO ] [ ID ]

This time having found subset of index records query planner cannot start 
retrieving them from the last to first.
It has to sort them by ID at first. Isn't it?


Maybe I'm not getting it right?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Григорий Григоренко
I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
much :)


I've calculated size of index (it is index on log (kind,computer) ) of its own: 
dropped index, run VACUUM and re-created index.

Database file increased by 105 Mb (and sqlite3 process counter shows that there 
were ~105 Mb written to disk).


This means that index on log(kind, computer) takes 105 Mb of database file (and 
whole size of database is 1259 Mb).


Now, I'm running query which is using this index (and is not returning any 
data) and monitor that sqlite3 process reads ~50 Mb. 


So there are  two major problems here.

1) SQLITE has to read about _half of index_ before it can use it (and 
understand there are no records matching query).

If cache is enough to hold 50 Mb then on subsuquent queries sqlite process is 
not reading at all. 

2) SQLITE is reading abnormally slowly during this first-time running query  
(waiting for something a lot?).

During index creation I monitored sqlite3 process and it was consuming CPU at 
~20% rate and it's doing I/O at ~10 Mb per second rate.
That's what I call "normal load"! 














___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT :Re: SELECT query first run is VERY slow

2011-07-25 Thread Григорий Григоренко
> 
> #1 I take it your query is CPU bound the first time?

Hmm, not sure. CPU was still under 1 % usage.

> 
> #2 Can you show us the query planner please?
> #3 Can you show us the query planner minus the "INDEXED BY"?

Sure. Executing:

SELECT * FROM log INDEXED BY idxlog_kind_computer  WHERE kind = 'info' AND 
computer=1 and id > 7070636 LIMIT 100;

Query plan:

0|0|0|SEARCH TABLE log USING INDEX idxlog_kind_computer (kind=? AND computer=?) 
(~3 rows)

Executing:

SELECT * FROM log WHERE kind = 'info' AND computer=1 and id > 7070636 LIMIT 100;

Query plan:

0|0|0|SEARCH TABLE log USING INDEX idxlog_kind_computer_process_who (kind=? AND 
computer=?) (~3 rows)


Query planner has selected idxlog_kind_computer_process_who index.



> 
> 

> #4 Can you show us sqlite_stat1?

Actually, there isn't any. I've not ran ANALYZE on this database. 
Since I'm using INDEXED BY there's no need in query planner decisions.

> 
> #5 Can you show us your tables now?
> 

You mean db schema?

// table with computer, process and who values
CREATE TABLE item(name text,value text);
CREATE UNIQUE INDEX idxitem_name_value ON item(name,value);

// log-table
CREATE TABLE log(id integer primary key autoincrement,msg text,created_at 
int,kind,computer,process,who);

CREATE INDEX idxlog_created_at ON log(created_at);
CREATE INDEX idxlog_kind ON log(kind);
CREATE INDEX idxlog_kind_computer ON log(kind,computer);
CREATE INDEX idxlog_kind_computer_process ON log(kind,computer,process);
CREATE INDEX idxlog_kind_computer_process_who ON log(kind,computer,process,who);

Since I've normalized data selecting (in real-life application) looks like:
SELECT ... kind,msg,c.value as computer,p.value as process,w.value as 
who,created_at,id FROM log ...
 LEFT JOIN item c ON c.rowid=log.computer 
 LEFT JOIN item p ON p.rowid=log.process 
 LEFT JOIN item w ON w.rowid=log.who 
...



> #6 What happens if you do "pragma cache_size=15000"?  I'd make cache_size = 
> (readbytes/8192)*1.1 at least.

(readbytes/8192)*1.1 gives us ~ 6700 pages. Tried "pragma cache_size=15000;" 
and nothing changed. 

Sqlite3 still has read ~ 50 Mb and spent 27 seconds on first query run.



> 
> #7 Care to post your database again?
> 

http://dl.dropbox.com/u/2168777/db2.rar

> 
> 
> It still seems to me this should run faster the first time unless you have a 
> really slow disk system or sqlite is doing something silly (which I doubt).
> 
> Perhaps the cache being to small is hurting things.

I'm not  sure cache settings are important.

You see, I've tried running query with different cache_size. Even with 
cache_size=100. 
Tried cache_size from 100 to 25000 and monitored memory usage of console 
process to make sure
it was using only the amount of memory given by cache.

Timing is _always_ ~ 1 second on subsequent (not first) runs. You can try it 
yourself with attached data.







> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
> 
> 
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Григорий Григоренко [grigore...@mail.ru]
> Sent: Monday, July 25, 2011 3:30 AM
> To: sqlite-users
> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
> 
> Okay, here are some steps I've done.
> 
> 1) normalize db;
> 
> I've created single table (item). "computer","process" and "who" fields in 
> log table became rowid integers pointing to this table.
> "kind" is still a short string.
> 
> 2) give up covering indexes;
> 
> For those not aware (http://www.sqlite.org/queryplanner.html , 
> 1.7 Covering Indices) it's an index that has additional columns at it's end ( 
> being selected by query).
> Covering index eliminate the need to read data from db records. But it 
> increases size of index and size of database.
> 
> 3) use INDEXED BY to suggest index to query planner.
> 
> Since indexes are created exactly for specific query.
> 
> 
> Software was working during weekend.
> 
> Yesterday's database has 6 mln records and it's only 1.1 Gb in size.
> 
> Comparing to 4 mln records and 3.5 Gb size before.
> 
> Now, the long running query took 27 seconds and it has read 50 Mb from 
> database (compare to 2 minutes and 307 Mb before).
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Григорий Григоренко
Okay, here are some steps I've done.

1) normalize db;

I've created single table (item). "computer","process" and "who" fields in log 
table became rowid integers pointing to this table. 
"kind" is still a short string.

2) give up covering indexes;

For those not aware (http://www.sqlite.org/queryplanner.html , 1.7 Covering 
Indices) it's an index that has additional columns at it's end ( being selected 
by query).
Covering index eliminate the need to read data from db records. But it 
increases size of index and size of database.

3) use INDEXED BY to suggest index to query planner.

Since indexes are created exactly for specific query.


Software was working during weekend. 

Yesterday's database has 6 mln records and it's only 1.1 Gb in size.

Comparing to 4 mln records and 3.5 Gb size before.

Now, the long running query took 27 seconds and it has read 50 Mb from database 
(compare to 2 minutes and 307 Mb before).

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-24 Thread Григорий Григоренко
ANALYZE lasted for ~ 15 minutes.


24 июля 2011, 17:21 от Tito Ciuro :
> Hi,
> 
> It has worked fairly well with small databases, but I see the problem with 
> medium to large files. Have you tried to run ANALYZE on your database? I'm 
> curious to know how long it takes.
> 
> -- Tito
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-24 Thread Григорий Григоренко
> 
> Perhaps my post dated Aug. 19, 2009 will help a little bit:
> 
> http://www.cocoabuilder.com/archive/cocoa/242954-core-data-dog-slow-when-using-first-time-after-boot.html
> 
> -- Tito
> 

Thanks for sharing.  "warming file" is a way to cache whole database as I 
understand it.

After everything is cached scattered reading from database runs faster. 

Unfortunately, in my case base size is ~ 3.5 Gb; it's too big for this strategy.

Even if I read at 25 Mb/s rate it will took 3500 / 25 = ~140 seconds just to 
read whole db file.

And what's more important I've only 2 Gb of RAM. 

Anyway, thanks for sharing. I guess these cases are similar. 


To me problem looks like this: 

SQLITE needs to read  (cache) from db a lot (too much?) while first-time query 
execution even if a query uses nicely matched index and returns nothing.

And SQLITE is doing lot's of scattered readings during query execution; not 
trying to somehow batch read or similar. That's why file caching helps.

If it's true not sure there's a simple and nice solution. 

I'll try some ideas (including normalization) and report results in this topic 
next week.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
Thank you for explaining this. 

I guess you're right about query planner deciding to avoid index usage based on 
stats. 

22 июля 2011, 18:30 от Richard Hipp <d...@sqlite.org>:
> On Fri, Jul 22, 2011 at 9:48 AM, Григорий Григоренко 
> <grigore...@mail.ru>wrote:
> 
> > >
> > > Please post the results of:
> > >
> > > SELECT * FROM sqlite_stat1;
> > >
> > >
> >   tbl = log
> >  idx = idxlog_kind_computer_process_who_id_msg_created_at
> >  stat = 2815667 563134 563134 469278 74097 1 1 1
> >
> 
> The first number on "stat" is the number of rows in the table.  The 2nd
> number is the average number of rows that have the same value for the first
> column of the index.  The 3rd number is the average number of rows that have
> the same value for the first 2 columns of the index.  And so forth.
> 
> There are a huge number of rows that have the same value for the first 4
> terms of this index, which shows us that this is a really lousy index.
> Ideally, you want the second number in the "stat" column to be something
> small, like 10.
> 
> If you do not run ANALYZE, SQLite has no way of knowing that the index is
> mostly useless.  SQLite assumes that the index is a good one, and that the
> 2nd integer in "stat" is 10.  And it therefore tries to use the index.  But
> since the index is so bad, the resulting performance is slow.
> 
> After running ANALYZE, SQLite realizes that the index is lousy and avoids
> using it.  Hence, performance is much better.
> 
> 
> >
> >  tbl = log
> >  idx = idxlog_kind_computer_process_id_who_msg_created_at
> >  stat = 2815667 563134 563134 469278 1 1 1 1
> >
> >  tbl = log
> >  idx = idxlog_kind_computer_id_process_who_msg_created_at
> >  stat = 2815667 563134 563134 1 1 1 1 1
> >
> >  tbl = log
> >  idx = idxlog_kind_id_computer_process_who_msg_created_at
> >  stat = 2815667 563134 1 1 1 1 1 1
> >
> >  tbl = log
> >  idx = idxlog_kind_computer_process_who_msg_created_at
> >  stat = 2815667 563134 563134 469278 74097 2 2
> >
> >  tbl = log
> >  idx = idxlog_created_at
> >  stat = 2815667 106
> >
> 
> 
> 
> -- 
> D. Richard Hipp
> d...@sqlite.org
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко


> > >
> > > Something that might speed up your operation is to issue the SQL command
> > 'ANALYZE' just once.  The results are stored in the database file, so you
> > can just do it manually now you have some data in the database.  It gives
> > the query optimizers lots of clues about how best to optimize each query. So
> > do an ANALYZE, then close the database and do your timing tests again.
> > >
> >
> >
> > Well, ANALYZE has changed something. Now query is using PRIMARY KEY and
> > executing almost instantly :-)
> >
> 
> Please post the results of:
> 
> SELECT * FROM sqlite_stat1;
> 
> 

Sure,

  tbl = T
  idx = 
 stat = 1

  tbl = item
  idx = idxitem_name_value
 stat = 21 7 1

  tbl = log
  idx = idxlog_kind_computer_process_who_id_msg_created_at
 stat = 2815667 563134 563134 469278 74097 1 1 1

  tbl = log
  idx = idxlog_kind_computer_process_id_who_msg_created_at
 stat = 2815667 563134 563134 469278 1 1 1 1

  tbl = log
  idx = idxlog_kind_computer_id_process_who_msg_created_at
 stat = 2815667 563134 563134 1 1 1 1 1

  tbl = log
  idx = idxlog_kind_id_computer_process_who_msg_created_at
 stat = 2815667 563134 1 1 1 1 1 1

  tbl = log
  idx = idxlog_kind_computer_process_who_msg_created_at
 stat = 2815667 563134 563134 469278 74097 2 2

  tbl = log
  idx = idxlog_created_at
 stat = 2815667 106
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
Database is "insert-only". There wasn't any deletes or updates, will VACUUM 
actually help in this case? I though it was about unused space?



Still. There's an index:

 idxlog_kind_computer_id_process_who_msg_created_at (

kind, computer, id, ...

)

Query is

kind = XXX AND computer = YYY and id BETWEEN ZZZ1 and ZZZ2

To my opinion this information is sufficient to jump directly to index position 
XXX,YYY,ZZZ1.

SQLITE must be able to do this real fast, isn't it?

And after finding this position inside index SQLITE clearly sees that previous 
and next record does not match query, so query should return nothing. 

This cannot take long. It's like scanning B-TREE to a predefined position and 
then reading just 2 records near.

And that's all. Why spending 2 minutes? 


Does SQLITE needs all of index to be in RAM to begin using  it? 
If so it can be the reason of bad performance...


22 июля 2011, 17:07 от Max Vlasov <max.vla...@gmail.com>:
> On Fri, Jul 22, 2011 at 3:50 PM, Григорий Григоренко <grigore...@mail.ru> 
> wrote:
> > Okay, I rebooted and tested again.
> >
> > First run took 76 seconds. Read bytes: ~ 307 Mb, Write bytes: ~ 66 Kb.
> >
> > Re-running: 1 second, Read bytes: ~ 307 Mb, Write bytes ~ 66 Kb.
> >
> >
> 
> Grigory, you posted to me directly (without the list e-mail), I'm
> reposting this reply to the list
> 
> Ok, I downloaded the db and it actually takes long time to execute.
> But this can be explained. Actually if you change your fist id > to
> zero, the query starts to return actual results, so there are many
> records with kind='info' and computer = 'KRAFTWAY' in your database.
> So sqlite really uses the index for locating your records. But the
> problem is also that your db is very fragmented so sqlite when reading
> the index actually reads different parts of this file and it makes the
> windows cache system mad. I think vacuum command should help.
> 
> Max
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
You are right. There's a way of normalizing and it will certainly reduce 
database size. 
Let's say it'll shrink by 50%. Still, if first-time runnning query timing will 
change from 2 min to 1 min it is still not acceptable. 
I cannot wait even a minute without logging.

So, before restructuring database and re-writing code I just want to understand 
SQLITE behaviour.
The question is: does SQLITE need to load whole index to start using it? Or 
SQLITE can use partial loads to quickly find records it needs (defined by 
columns in conditions)?





> You need to normalize your database.
> 
> 
> 
> 
> 
> 
> 
> Though it's easy to put everything in one table it's horrendously inefficient 
> for indexing.
> 
> 
> 
> Your indexes are huge as you're putting strings in them.
> 
> 
> 
> 
> 
> 
> 
> Ideally you should never have a string in an index if you can avoid it.  Too 
> much space, too long a time to index, to long a time to compare.
> 
> 
> 
> 
> 
> 
> 
> So for starters create a "kind" table and "computer" table and just put the 
> _rowid into your table instead of the string.
> 
> 
> 
> 
> 
> 
> 
> I'll bet you things will run a lot faster and your database will shrink.
> 
> 
> 
> 
> 
> 
> 
> Same goes for other strings.
> 
> 
> 
> 
> 
> 
> 
> Michael D. Black
> 
> 
> 
> Senior Scientist
> 
> 
> 
> NG Information Systems
> 
> 
> 
> Advanced Analytics Directorate
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко



> 
> Something that might speed up your operation is to issue the SQL command 
> 'ANALYZE' just once.  The results are stored in the database file, so you can 
> just do it manually now you have some data in the database.  It gives the 
> query optimizers lots of clues about how best to optimize each query. So do 
> an ANALYZE, then close the database and do your timing tests again.
> 


Well, ANALYZE has changed something. Now query is using PRIMARY KEY and 
executing almost instantly :-)

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
> 
> > Database extension is ".dblite"
> 
> That should be okay.  No need to change that.
> 
> Something that might speed up your operation is to issue the SQL command 
> 'ANALYZE' just once.  The results are stored in the database file, so you can 
> just do it manually now you have some data in the database.  It gives the 
> query optimizers lots of clues about how best to optimize each query. So do 
> an ANALYZE, then close the database and do your timing tests again.

Okay, will do that.

> 
> If you're still getting strange results after that, you can use the 
> command-line tool to make SQLite reveal whether it's using an index or not.  
> You do that by issuing your SELECT command with 'EXPLAIN QUERY PLAN ' at the 
> beginning of it:
> 
> EXPLAIN QUERY PLAN SELECT 1 as today, id as rowid, 
> kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4070636 AND 
> id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY';
> 
> Looking at the output it produces (which you may post, by all means) any 
> mention of 'scan' means it's having to read every record in the database, 
> whereas a mention of an index means it's using an index as expected.
> 

You missed it in my post - I did it. I did "EXPLAIN ..." and SQLITE says it's 
using covering index. There's nohting wrong in the query plan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
Database extension is ".dblite"

I'm using Process Explorer 
(http://technet.microsoft.com/en-us/sysinternals/bb896653) to monitor processes 
cpu and i/o usage.

During these long running queries I am not using any other program. I've 
terminated any unused service  (MS SQL, for example). System idle CPU is at 
98-99% during all the time. 

I've just tried another system (of my colleague) that has Windows 7. Same 
result: first run took 40 seconds, second: 1 second.




22 июля 2011, 15:28 от Simon Slavin :
> 
> On 22 Jul 2011, at 12:15pm, Max Vlasov wrote:
> 
> > Do you have and anitvirus software installed? It may scan the file at
> > the first usage.
> 
> What is the extension to the filename of the database file ?  Windows does 
> fancy caching for files that have some particular extensions including, IIRC, 
> '.db'.  Unfortunately although this works well for database files used for 
> Windows it's terrible for ones which aren't.
> 
> If you've used a common extension for the database file rename it '.sqlite' 
> or something that isn't going to be confused for another database format.
> 
> And to answer your question, no, it's not normal or a known bug.
> 
> Does Windows XP have some sort of task display where you can see what task is 
> hogging most of the CPU or disk access ?
> 
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
I have antivirus (avast) but it's disabled during testing.

22 июля 2011, 15:15 от Max Vlasov <max.vla...@gmail.com>:
> On Fri, Jul 22, 2011 at 2:38 PM, Григорий Григоренко <grigore...@mail.ru> 
> wrote:
> > I have a log's database. Schema :
> >
> > Query:
> >
> > SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
> > FROM log WHERE id > 4070636 AND id <= 9223372036854775807 AND kind = 'info' 
> > AND computer='KRAFTWAY';
> >...
> >
> >
> > Reboot system. Run query. Elapsed: ~100 SECONDS. Re-run query. Now it 
> > executes in less than a second. Re-run query. The same: less than a second. 
> > Tried it several times.
> >
> >
> > RUNNING THIS QUERY JUST AFTER REBOOT TAKES UNPREDICTABLE LARGE TIME, FROM 1 
> > TO 2 MINUTES.
> >
> >
> 
> Do you have and anitvirus software installed? It may scan the file at
> the first usage.
> 
> To be sure sqlite doesn't read too much I suggest opening Task
> Manager, choosing column I/O Read Bytes to be shown and run the query.
> This also can give a hint about the software that possibly reads the
> file all the this time.
> 
> Max
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
I have a log's database. Schema :


CREATE TABLE log(id integer primary key autoincrement,msg text,created_at 
int,kind text,computer text,process text,who text); 
CREATE INDEX idxlog_created_at ON log(created_at); 
CREATE INDEX idxlog_kind_computer_id_process_who_msg_created_at ON 
log(kind,computer,id desc,process,who,msg,created_at); 
CREATE INDEX idxlog_kind_computer_process_id_who_msg_created_at ON 
log(kind,computer,process,id desc,who,msg,created_at); 
CREATE INDEX idxlog_kind_computer_process_who_id_msg_created_at ON 
log(kind,computer,process,who,id desc,msg,created_at); 
CREATE INDEX idxlog_kind_id_computer_process_who_msg_created_at ON log(kind,id 
desc,computer,process,who,msg,created_at); 


Database: ~ 3.5 Gb size, ~ 2.8 mln records. page_size=8192 bytes, UTF-8.

Platform: Windows XP, Intel Core Duo 3 Ghz.

SQLITE: sqlite3.exe console, latest version (3.7.7.1).


Query:

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log WHERE id > 4070636 AND id <= 9223372036854775807 AND kind = 'info' AND 
computer='KRAFTWAY';


Query plan (EXPLAIN QUERY PLAN):

0|0|0|SEARCH TABLE log USING COVERING INDEX 
idxlog_kind_computer_process_who_id_msg_created_at (kind=? AND computer=?) (~2 
rows)

There are no records in database that match query, i.e. query returns nothing.


Now, the problem. 


Reboot system. Run query. Elapsed: ~100 SECONDS. Re-run query. Now it executes 
in less than a second. Re-run query. The same: less than a second. Tried it 
several times. 


RUNNING THIS QUERY JUST AFTER REBOOT TAKES UNPREDICTABLE LARGE TIME, FROM 1 TO 
2 MINUTES.


Is this a sqlite bug or normal behaviour?

If it's normal, there are some questions.

Am I hitting some limit with my database? Which one?

Why is SQLITE (having nicely matching index) is spending so much time to 
execute? 

SQLITE has to cache some index data? Yeah but the console process (during 
execution of this query) is reading at about 800 Kb per second rate! (while HDD 
can make 100 Mb per second) Why so serious? Execuse me, why so slow?

And SQLITE process it's not consuming CPU (less than 1 percent). What is SQLITE 
doing during this long running query? 

It's own private memory is holding still at about 20 Mb. It's not consuming CPU 
and HDD. What is he waiting for?


This problem makes SQLITE completely unusable in my project.


I've shared database and scripts (Warning: it's RAR archive, ZIP cannot hold 
files of 3.5 Gb size. You can download WINRAR here: 
http://www.rarlab.com/download.htm ):

http://dl.dropbox.com/u/2168777/deadly_sqlite.rar

Download size: ~150 Mb.

Unzip and execute "run.cmd".

You'll see execution plan and after a while there will be line "Seconds 
elapsed: XXX".









___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users