Re: [sqlite] PRAGMA optimize; == no busy handler?

2017-10-03 Thread Paul

From my personal experience, performance instability of SQLite queries 
(drastically jumping from milliseconds to seconds and back) that is fixable by 
running ANALYZE means that your queries are not optimal (some crucial indices 
are missing). The reason ANALYZE helps is because the info that planner 
receives helps to mitigate the absence of important indices by adapting the 
strategy. Say for example, you have a query that involves few tables being a 
JOIN-ed. There may be the case when SQLite has to scan one of the tables but is 
does not know which, so it uses some default logic that does not account table 
size (since SQLite does not know it). As the result, wrong table may be chosen 
(the largest one).

Why this happens exactly after the first record is inserted? Probably because 
Query Planner is smart and uses some meta info. Probably SQLite does not 
allocate a page for empty tables before the first record is inserted. Query 
Planner may take an advantage of this knowledge and optimizing query 
accordingly.

All of this is just a speculation, of course. What you definitely should do is 
run EXPLAIN QUERY PLAN on the sluggish query. Better yet, run EXPLAIN QUERY 
PLAN on *every* query, just to be sure.

-Paul


> >Don't do that
> Mostly sure, but there's some cases SQLite will skip the busyhandler and 
> immediately return, but they're not permanently-busy conditions. We have 
> multiple threads across multiple processes using the database
> 
> > why you’re doing "PRAGMA optimize" while other parts of your program are 
> > accessing the database
> Because we don't necessarily know there's other work in progress, and we 
> don't have a good opportunity to ensure we do it later but-no-later than when 
> we need it
> 
> >It makes more sense just to put ANALYZE into some sort of end-of-month 
> >procedure
> >or a manual maintenance procedure your users run only when nobody else is 
> >using the system
> That's what we did but the real world isn't that simple
> 
> Last month we did...
> * ANALYZE when the staterepository service starts (effective boot time) if it 
> hasn't been done in a while (1 or 3 days, I forget)
> * ANALYZE at the end of a deployment operation if it hasn't been done in a 
> while (1min I think)
> @ 1st login there's 30+ operations in a short window (1-2 minutes?) and folks 
> are very anal^H^H^H^Hconscious re 1st login perf. The workload is rather 
> variable all told thus the 1min delay. We usually wind up hitting this a 
> couple of times @ 1st login. That primes sqlite_stat1 sufficiently that 
> statements execute in ms (at most) as expected. .
> 
> We recently found a problem where you click on a hyperlink in Edge which 
> (among other things) queries the database and _sometimes_ takes 30s+ instead 
> of near-zero. But we couldn't repro it, and in the rare case someone hit it 
> the problem went away on its own for no apparent reason either. We finally 
> found the problem was a table involved in the query had 1 record and 
> executing the raw SQL took an abnormally long time - but ANALYZE and re-query 
> and time was near-zero. As if we'd added a row to the table for the first 
> time but didn't do ANALYZE so SQLite had no stats to inform its planning, and 
> of course if you reboot after a day or 3 the 'maintenance' would kick in and 
> update stats. Or if something else happened to get installed on the system 
> causing a new deployment operation to call ANALYZE. But the user experience 
> was poor and too unpredictable... 
> 
> PRAGMA optimize; fixed this. We get the ANALYZE benefits we need, but only 
> for those tables that grew significantly (and 0 to >0 qualifies), and 
> 
> We also cache connections for what can be lifetime of a service, so optimize 
> @ connection close is too long to wait.
> 
> 
> My quick hack is to change the PRAGMA optimize per deployment operation to a 
> best-effort - remove the busy handler, try it and restore. I suspect I need 
> something more involved but I'm still weighing my options. The big Q is 
> understanding PRAGMA optimize (and ANALYZE) in relation to busy|locked 
> conditions.
> 
> 
> What do you think of an option to only analyze tables that have grown from 0 
> records to >0 e.g. PRAGMA optimize(0x10002) so 2=ANALYZE if beneficial and 
> 0x1=change the criteria from "increased by 25 times or more since the 
> last" to "increased from 0 records to 1+ record since the last? I've seen 
> problems when we have data w/o analyze but not if we grow from N to >N, 
> thought that could be coincidental where we analyze often enough stats never 
> get too out of proportion to the data. So far the only definitive statement I 
> can make is ANALYZE on a table that went from 0 records to 1+ notably 
> improves queries by orders of magnitude.
>  - Howard
> 
> 
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
> Behalf Of Simon Slavin
> Sent: Tuesday, October 3, 2017 5:31 PM
> To: 

Re: [sqlite] PRAGMA optimize; == no busy handler?

2017-10-03 Thread Simon Slavin


On 4 Oct 2017, at 2:23am, Howard Kapustein  
wrote:

> What do you think of an option to only analyze tables that have grown from 0 
> records to >0 e.g. PRAGMA optimize(0x10002) so 2=ANALYZE if beneficial and 
> 0x1=change the criteria from "increased by 25 times or more since the 
> last" to "increased from 0 records to 1+ record since the last?

ANALYZE builds a table of statistics which tells SQLite which search strategy 
will be optimal.  The table of statistics does not have to exactly match the 
data in your tables (in fact that’s very rare).  The big changes noted by 
ANALYZE are not to do with the number of rows in each table, they are to do 
with the 'chunkiness' of each indexed column: whether a column has only two 
values (indoor / outdoor) or thousands of different values (surname).  And 
'chunkiness' doesn’t change much once a database is big enough that search-time 
matters.

It is not expected that you’ll try to run ANALYZE while a database is in use.  
It’s intended for offline-maintenance.

So why not work around the whole process ?  On your development system, create 
a database with plausible data in.  Run ANALYZE.  Then copy all the tables 
named "sqlite_stat*" into a new database.

You can then copy those tables into the databases for your production system.  
They will 'tune' SQLite to pick strategies which work best for a typical 
dataset.  You never have to run ANALYZE or "PRAGMA optimize" on that system.

Yes, it won’t give absolutely the fastest possible operations on your 
production system(s).  But they will be very close to it, and you have the 
advantage of never having to lock the database up with a maintenance procedure.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] PRAGMA optimize; == no busy handler?

2017-10-03 Thread Howard Kapustein
>Don't do that
Mostly sure, but there's some cases SQLite will skip the busyhandler and 
immediately return, but they're not permanently-busy conditions. We have 
multiple threads across multiple processes using the database

> why you’re doing "PRAGMA optimize" while other parts of your program are 
> accessing the database
Because we don't necessarily know there's other work in progress, and we don't 
have a good opportunity to ensure we do it later but-no-later than when we need 
it

>It makes more sense just to put ANALYZE into some sort of end-of-month 
>procedure
>or a manual maintenance procedure your users run only when nobody else is 
>using the system
That's what we did but the real world isn't that simple

Last month we did...
* ANALYZE when the staterepository service starts (effective boot time) if it 
hasn't been done in a while (1 or 3 days, I forget)
* ANALYZE at the end of a deployment operation if it hasn't been done in a 
while (1min I think)
@ 1st login there's 30+ operations in a short window (1-2 minutes?) and folks 
are very anal^H^H^H^Hconscious re 1st login perf. The workload is rather 
variable all told thus the 1min delay. We usually wind up hitting this a couple 
of times @ 1st login. That primes sqlite_stat1 sufficiently that statements 
execute in ms (at most) as expected. .

We recently found a problem where you click on a hyperlink in Edge which (among 
other things) queries the database and _sometimes_ takes 30s+ instead of 
near-zero. But we couldn't repro it, and in the rare case someone hit it the 
problem went away on its own for no apparent reason either. We finally found 
the problem was a table involved in the query had 1 record and executing the 
raw SQL took an abnormally long time - but ANALYZE and re-query and time was 
near-zero. As if we'd added a row to the table for the first time but didn't do 
ANALYZE so SQLite had no stats to inform its planning, and of course if you 
reboot after a day or 3 the 'maintenance' would kick in and update stats. Or if 
something else happened to get installed on the system causing a new deployment 
operation to call ANALYZE. But the user experience was poor and too 
unpredictable... 

PRAGMA optimize; fixed this. We get the ANALYZE benefits we need, but only for 
those tables that grew significantly (and 0 to >0 qualifies), and 

We also cache connections for what can be lifetime of a service, so optimize @ 
connection close is too long to wait.


My quick hack is to change the PRAGMA optimize per deployment operation to a 
best-effort - remove the busy handler, try it and restore. I suspect I need 
something more involved but I'm still weighing my options. The big Q is 
understanding PRAGMA optimize (and ANALYZE) in relation to busy|locked 
conditions.


What do you think of an option to only analyze tables that have grown from 0 
records to >0 e.g. PRAGMA optimize(0x10002) so 2=ANALYZE if beneficial and 
0x1=change the criteria from "increased by 25 times or more since the last" 
to "increased from 0 records to 1+ record since the last? I've seen problems 
when we have data w/o analyze but not if we grow from N to >N, thought that 
could be coincidental where we analyze often enough stats never get too out of 
proportion to the data. So far the only definitive statement I can make is 
ANALYZE on a table that went from 0 records to 1+ notably improves queries by 
orders of magnitude.

- Howard


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Tuesday, October 3, 2017 5:31 PM
To: SQLite mailing list 
Subject: Re: [sqlite] PRAGMA optimize; == no busy handler?

On 4 Oct 2017, at 12:54am, Howard Kapustein  
wrote:

> We use the default busy-handler w/timeout=250ms and call sqlite_exec("PRAGMA 
> optimize;") in a loop until success or non-busy/locked error or some ungodly 
> amount of time elapses (~5min). The logs indicate PRAGMA optimize; fails due 
> to SQLITE_BUSY w/o ever going through the busy-handler, and then we spin 
> trying a few thousand times.

Don’t do that.  It defeats the purpose of the timeout you set.  If you want a 
longer timeout than 250ms, set that as your timeout.  SQLite’s own retry 
process is clever.  It uses exponential backoff and can get at the locks at a 
lower level than your own code.

I’ve used timeouts in my own code of 5 minutes.  And I never had a user report 
the error message they’d see if that failed.

> Docs for PRAGMA optimize; says nothing about busy|locked scenarios, nor any 
> need to call this in a transaction (nor even if that's legal). Ditto ANALYZE 
> docs say nothing about busy|locked scenarios
> 
> What's expected if PRAGMA optimize hits a busy|locked scenario?
> What am I the caller expected to do?

The only locking in SQLite is to lock the entire database.  So the initial 
question is whether you do have some other 

Re: [sqlite] PRAGMA optimize; == no busy handler?

2017-10-03 Thread Simon Slavin
On 4 Oct 2017, at 12:54am, Howard Kapustein  
wrote:

> We use the default busy-handler w/timeout=250ms and call sqlite_exec("PRAGMA 
> optimize;") in a loop until success or non-busy/locked error or some ungodly 
> amount of time elapses (~5min). The logs indicate PRAGMA optimize; fails due 
> to SQLITE_BUSY w/o ever going through the busy-handler, and then we spin 
> trying a few thousand times.

Don’t do that.  It defeats the purpose of the timeout you set.  If you want a 
longer timeout than 250ms, set that as your timeout.  SQLite’s own retry 
process is clever.  It uses exponential backoff and can get at the locks at a 
lower level than your own code.

I’ve used timeouts in my own code of 5 minutes.  And I never had a user report 
the error message they’d see if that failed.

> Docs for PRAGMA optimize; says nothing about busy|locked scenarios, nor any 
> need to call this in a transaction (nor even if that's legal). Ditto ANALYZE 
> docs say nothing about busy|locked scenarios
> 
> What's expected if PRAGMA optimize hits a busy|locked scenario?
> What am I the caller expected to do?

The only locking in SQLite is to lock the entire database.  So the initial 
question is whether you do have some other thread/process accessing the 
database.

I would ask you to reconsider why you’re doing "PRAGMA optimize" while other 
parts of your program are accessing the database. Right now it tells you only 
whether ANALYZE is needed, and there’s no harm in doing ANALYZE when it’s not 
needed.

It makes more sense just to put ANALYZE into some sort of end-of-month 
procedure or a manual maintenance procedure your users run only when nobody 
else is using the system.  That way the routine takes a more predictable amount 
of time, which users like.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Ordering a GROUP BY, or other concatenating tricks?

2017-10-03 Thread Simon Slavin


On 3 Oct 2017, at 11:13pm, Doug Nebeker  wrote:

> How can I select a document and get the complete sorted text back in a single 
> row (so I can do a JOIN on a different table with additional information)?  

There is a way which will probably work but the documentation adds a careful 
note that it will not always work.



So you would want something like

SELECT group_concat(LineText, '\n') FROM
(SELECT LineText FROM DocLines
WHERE DocID = 10
ORDER BY LineIndex)

The problem is that the order of concatenation is arbitrary, in other words you 
shouldn’t rely on this working.

If you don’t want to use group_concat(), do it in your programming language.  
Use

SELECT LineText FROM DocLines
WHERE DocID = 10
ORDER BY LineIndex

and concatenate the retrieved values in your programming language.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Ordering a GROUP BY, or other concatenating tricks?

2017-10-03 Thread Doug Nebeker
Imagine a table that holds individual lines of text documents:

CREATE TABLE DocLines
(
DocID INTEGER,
LineIndex INTEGER,
LineText TEXT
);

INSERT INTO DocLines (DocID, LineIndex, LineText) VALUES (1, 1, 'Mary had a 
little lamb');
INSERT INTO DocLines (DocID, LineIndex, LineText) VALUES (1, 2, 'It had white 
fleece');

//inserted in reverse order so insertion order doesn't happen to make the 
SELECT work

INSERT INTO DocLines (DocID, LineIndex, LineText) VALUES (2, 2, 'Humpty dumpty 
had a great fall');
INSERT INTO DocLines (DocID, LineIndex, LineText) VALUES (2, 1, 'Humpty dumpty 
sat on a wall');

How can I select a document and get the complete sorted text back in a single 
row (so I can do a JOIN on a different table with additional information)?  

The below happens to work fine for DocID 1 (because of insertion order), but 
fails for DocID 2.

SELECT group_concat(LineText, '\n') 
FROM DocLines 
WHERE DocID = 1
GROUP BY DocID;

Is there any way to order a GROUP BY, or some other way to concatenate text?


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] PRAGMA optimize; == no busy handler?

2017-10-03 Thread Howard Kapustein
Using SQLite 3.20.1 I notice a flood of log events sometimes when I call PRAGMA 
optimize;

Warning 0x5: statement aborts at 1: [PRAGMA optimize;] database is locked
And a few times
Warning 0x5: statement aborts at 2: [PRAGMA optimize;] database is locked
And even once
Warning 0x5: statement aborts at 35: [PRAGMA optimize;] database is locked

We use the default busy-handler w/timeout=250ms and call sqlite_exec("PRAGMA 
optimize;") in a loop until success or non-busy/locked error or some ungodly 
amount of time elapses (~5min). The logs indicate PRAGMA optimize; fails due to 
SQLITE_BUSY w/o ever going through the busy-handler, and then we spin trying a 
few thousand times.

Docs for PRAGMA optimize; says nothing about busy|locked scenarios, nor any 
need to call this in a transaction (nor even if that's legal). Ditto ANALYZE 
docs say nothing about busy|locked scenarios

What's expected if PRAGMA optimize hits a busy|locked scenario?
What am I the caller expected to do?


  *   Howard
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tool for extracting deleted data from unvacuumed SQLite files

2017-10-03 Thread LincolnBurrows
If you are not a developer you can avoid having the same problem again using
Sqlite Recovery tool and  recover deleted records from Sqlite database
  . It adds
single-master replication to SQLite.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tool for extracting deleted data from unvacuumed SQLite files

2017-10-03 Thread LincolnBurrows
If you are not a developer you can avoid having the same problem again using
Sqlite Recovery tool and  recover deleted records from Sqlite database
  . It adds
single-master replication to SQLite.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] pragma integrity_check throwing exception

2017-10-03 Thread Roberts, Barry (FINTL)
Hi,

Assume I have a database which is "malformed" due to rowid xxx missing from a 
table index. In the System.Data.SQLite.dll 1.0.80.0 C# driver the following 
code would return the reason, I would get a list of the rowid problems allowing 
me to log them.

private static IEnumerable IntegrityCheck(SQLiteConnection 
connection)
{
using (var command = new SQLiteCommand(connection))
{
command.CommandTimeout = 0;
command.CommandText = "pragma integrity_check";

using (var reader = command.ExecuteReader())
{
while (reader.Read())
{
yield return reader.GetString(0);
}
}
}
}

However using a later driver such as 1.0.105.1 the ExecuteReader() call throws 
an exception saying the database is malformed. That is not very helpful and 
stops me running the integrity check. If I replace the integrity_check with 
quick_check it works and returns ok, so access to the file is ok, just some 
internal indexes are messed up. Is there any way to get the above working in 
the later drivers?

Kind Regards,
Fugro Intersite.

Barry Roberts.
b.robe...@fugro.com | www.fugro.com


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users