Re: [sqlite] Concurrency Question

2019-11-23 Thread Mario M. Westphal
 

That's very helpful, thank you very much, Daniel. Also to Jens.

 

I've re-read the SQLite documentation for shared cache now. I guess I had
enabled it in the past to increase the performance (many threads in my
application).
Apparently, this had the opposite effect :-/

 

After disabling the shared cache mode, the locks no longer happen. As I had
initially anticipated, the update and the read now run happily in parallel.

 

Thanks again. On a weekend, even!

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Concurrency Question

2019-11-23 Thread Mario M. Westphal
I have an issue with concurrency. This may be the intended behavior of
SQLite. Or I'm doing something wrong.

If somebody could shed a light, I would be thankful.

 

I compile and use SQLite on Windows with VStudio.

I compile it with SQLITE_THREADSAFE=1

At runtime, I use sqlite3_open_v2 () and set the flag SQLITE_OPEN_NOMUTEX.

I use SQLite in WAL mode.

My application uses several threads. Each thread opens its own database
connection.

 

Two threads run in parallel.

Thread A does a lengthy UPDATE to table_A (prepared statement). This update
takes, say, 5 seconds.

Thread B uses a prepared statement to SELECT data from an unrelated table_B.

Thread_B is blocked seconds in sqlite3Step because lockBtreeMutex() blocks
in a mutex.

 

I did not expect this.

Why is thread_B blocked when doing a read just because SQLite is writing to
another table?

 

Is this the expected behavior or am I doing something stupid in my code. And
if so, what to check?

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-09 Thread Mario M. Westphal
Thanks to all the friendly people who commented on my question. Much
appreciated :-)

 

I was able to solve this with a small trick:

 

I created a small 'state' struct with a rowid and the result (float) for
that row.

Using the "user data" parameter when creating EXPENSIVE_FUNCTION, I supply a
pointer to this struct to the function.

(I have full control over when EXPENSIVE_FUNCTION is created, dropped and
used. This trick may not work in other use cases.)

 

Then I've changed the EXPENSIVE_FUNCTION signature to also take the rowed as
the first parameter: 

 

EXPENSIVE_FUNCTION(rowid,?99,vdata) 

 

EXPENSIVE_FUNCTION uses sqlite3_user_data() to get the state struct pointer
and then compares the rowid parameter with the rowid in the struct.

If they are identical, the cached result is used. Very fast.

Else the result for the requested row is calculated and cached. This is the
slow part.

 

Thanks to this change, EXPENSIVE_FUNCTION needs to perform the slow
calculations in only 46,031 of 91,806 calls. 

In all other cases the cached value from the previous call can be used.

The runtime drops to 2,580ms (from 3,350ms) for the 45K rows sample set.
Which yields a roughly 20% better runtime. Very good.

 

I'm always amazed about what can be achieved with SQLite. Very impressive
product and API design.

 

Thank again for all who provided suggestions and commented.

 

-- Mario

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLITE_DETERMINISTIC and custom function optimization

2019-11-08 Thread Mario M. Westphal
Hi all,

 

I have a table with matrices stored as blobs. Each matrix has about 800
bytes.

This table has between 20,000 and 500,000 rows.

 

I use a custom function "EXPENSIVE_FUNCTION" which performs a calculation
using a matrix supplied via sqlite3_bind_pointer() as ?99 and the matrix in
the vdata column.

I create the function using the SQLITE_DETERMINISTIC flag. My hope was that
the EXPENSIVE_FUNCTION is called only once per row. But that's not the case.

 

The query looks like this:

 

UPDATE some_table 

 

SET 

vdist = EXPENSIVE_FUNCTION(?99,vdata), 

oid = ?1, 

flags = flags | (CASE WHEN EXPENSIVE_FUNCTION(?99,vdata) < ?2 THEN ?3 ELSE 0
END)

 

WHERE

(flags & ?3) = 0  AND

(oid IS NULL AND EXPENSIVE_FUNCTION(?99,vdata) < 0.6) OR 

(EXPENSIVE_FUNCTION(?99,vdata) < vdist)

 

The EXPENSIVE_FUNCTION function is referred multiple times in the update
statement. But it always returns the same result (for any given row).

 

My stats report that SQLite calls EXPENSIVE_FUNCTION 91,806 times for a
table with 45,775 rows. 

256 rows are modified. This takes (only) 3.3 seconds.

 

The profiler tells me that sqlite3VdbeExec() spends 47% in
vdbeMemFromBtreeResize and 36% in EXPENSIVE_FUNCTION.

 

Can I change something so SQLite calls EXPENSIVE_FUNCTION only once per row?

 

Thanks in advance.

 

-- Mario

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Some FTS5 guidance

2016-01-07 Thread Mario M. Westphal
Hello,



I recently looked into FTS 5. 

The documentation is clear and I was able to get it running with a small
test database quickly. And the response times are awesome :-)



My question: 



At least as I understand it at this point, FTS can only do prefix queries.



If my database contains the words



moon

moonlight

moonshine

shine

sunshine



A FTS query like "moon*" will find all three terms starting with "moon" -
very fast.



But there is no way to find "moonshine" or "sunshine" by running a query for
"shine" or "shine*" ?



Currently I search using LIKE and there such 'contains' queries are easy. My
users of course don't understand all this and want to find all words
containing shine, wherever the term appears in the word.



The only idea I had so far was to write my own tokenizer and to store each
word with every possible 'sub-word':



When "moonshine" is added to FTS, it is split into multiple words:



moonshine
oonshine
onshine
nshine
shine
. 



(maybe I limit this to a minimum of 2 or 3 characters).



This of course produces a log of extra entries in FTS and may impact
performance and database size. 

I hence wonder if this problem has been tackled already and if there is a
"standard" solution. 



[sqlite] Query times vary between 0.2 s and 30 s for very

2015-03-24 Thread Mario M. Westphal
The temporary table is creates as



CREATE TEMPORARY TABLE _tempNNN (oid INTEGER PRIMARY KEY)



So the optimizer must know that the values are unique.







[sqlite] Is there a limit for the number of items in an IN clause?

2015-03-21 Thread Mario M. Westphal
In a recent question 
(http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2015-March/058668.html)
 I found out that joining with a single-column temporary table with 500 rows is 
sometimes several hundred times (!) slower than using an IN clause.



So far my code switched to using a temporary table and a JOIN instead of an IN 
clause when the IN clause would contain more than 500 elements (numbers). I 
would like to use larger IN clauses if that?s possible to avoid using temporary 
tables, but I could not find a limit for how many elements I can use in IN().



[sqlite] Query times vary between 0.2 s and 30 s for very

2015-03-21 Thread Mario M. Westphal
I don?t think I can always run an analyze on the TEMP table for each query. May 
ruin performance worse than trying my luck with the temp table.



I think this boils down why a JOIN with a 500 row table (one column) is so much 
(several hundred times) slower than using an IN clause with 500 values. 

I have not changed the code or query for a long time, so my assumption is that 
a change in one of the recent SQLite updates caused the changed behavior of the 
optimizer/query engine. I recall a similar problem maybe a year ago. I think 
only one of the SQLite developers may shed some light on this.



For now I have increased the threshold for IN clauses (instead of TEMP tables) 
and use WHERE IN (SELECT ? from TEMP) instead of a JOIN.







[sqlite] Query times vary between 0.2 s and 30 s for very similar queries - how to trick the optimizer?

2015-03-18 Thread Mario M. Westphal
I?m using 3.8.8.1 on Windows via the ?C? interface.

I work with SQLite for several years with good success. And I know that no 
optimizer ever will be perfect.
But I?ve just stumbled upon a case where a very similar query requires between 
0.2 seconds and a whopping 30 seconds.

I?ve simplified things as much as possible and included the create instructions 
and queries below.

1.  My database has a table "art_data" which holds information about an 
article. A second table "attr" holds all available attributes.
art_data may contain any number of attribues per article, typically between 50 
and 200.
The art_data table in the production database has about 22 million rows, the 
attr table 20,000.

art_data
oid INTEGER PRIMARY KEY,
art_oid INTEGER,
attr_oid INTEGER,
tdata TEXT

the attr_oid refers to the table which defines the available attributes and 
tdata is the value for that attribute. The attr table is defined as:

attr
oid INTEGER PRIMARY KEY
class INTEGER
tag TEXT

For the indices created, please see below.

2.  My application needs to select all the data for a specific article and 
specific attributes. The number of attributes to select is usually between 1 
and 30 so I use an IN clause and provide the attribute ids directly in the 
SELECT query. My application has these 20,000 ids cached and always available.
If the number of articles is < 500, my application supplies a list of articles 
for the d.oid IN (...) as well.

SELECT d.oid, d.tdata FROM art_data d
WHERE d.oid IN (1,890,222,...)  
AND d.attr_oid IN (2188,2191,2251,2272,...) 
ORDER BY d.oid ASC, d.attr_oid ASC, d.rowid ASC

This query takes between 0.2 and 0.5 seconds, even if 500 article numbers are 
in the first WHERE d.oid IN clause!

If more than 500 articles are needed, I estimated that this would probably 
break the IN (is there a limit for IN?) and thus my application puts the 
article into a temporary table and JOINs with this table:

SELECT d.oid, d.attr_oid, d.tdata FROM art_data d 
INNER JOIN _temp _t ON d.oid = _t.oid  
AND d.attr_oid IN (2188,2191,2251,2272,...)
ORDER BY d.oid ASC, d.attr_oid ASC, d.rowid ASC

This query takes between 17 and 30 seconds (!) even if the temporary table only 
has 501 article numbers (one more than the threshold for the IN clause).

The only difference between 0.2 and 17 seconds is replacing an IN clause with 
500 numbers with a JOIN with a temporary table containing 501 numbers.

While playing with that, I used a SQLite GUI tool and created the temporary 
table _temp (oid INTEGER PRIMARY KEY) as a regular table and filled it with 500 
article numbers. This also resulted in the 17 to 30s query times.

For a test, I ran ANALYZE and the query time dropped down to 0.5 seconds. 
AMAZING.
Apparently the query analyzer now had the info about the (no longer) temporary 
table and was able to use it efficiently.

My SOLUTION for now was to change the query with the temporary table to

SELECT d.oid, d.attr_oid, d.tdata FROM art_data d 
WHERE d.attr_oid IN (2188,2191,2251,2272,...) 
AND d.oid IN (SELECT oid FROM _temp)

Instead of a JOIN for the temporary table I use an IN clause with a SELECT. 
This brought the query time down to 0.5 seconds as well. May also be the 
optimizer.

The question is: When JOINing large tables with a temporary table, how to 
ensure that the optimizer can work optimal? Running ANALYZE with a temporary 
table probably does not work, and ANALYZE takes about 1 minute on this database 
so this is not feasible for each query.

I'm glad to have found an apparently working solution (IN instead of JOIN) but 
I wonder if this could be somehow automated by the optimizer? Or maybe this is 
a worst-case for the optimizer?



If you want to try this out yourself, here is the complete CREATE schema and 
queries:

-- BEGIN -
DROP TABLE IF EXISTS art_data; 
DROP TABLE IF EXISTS attr;

CREATE TABLE art_data (oid INTEGER, attr_oid INTEGER, tdata TEXT, FOREIGN 
KEY(attr_oid) REFERENCES attr(oid) ON DELETE CASCADE);

CREATE INDEX idx_art_data_oid ON art_data(oid);
CREATE INDEX idx_art_data_oid_tag_oid ON art_data(oid,attr_oid);
CREATE INDEX idx_art_data_attr_oid ON art_data(attr_oid);

CREATE TABLE attr (oid INTEGER PRIMARY KEY, class INTEGER, tag TEXT);
CREATE INDEX idx_attr_tag ON attr(tag);

DROP TABLE IF EXISTS _temp;
CREATE TABLE _temp (OID INTEGER PRIMARY KEY);
--insert into _temp select ...

-- Fast: 0.2 seconds
-- explain query plan
SELECT d.oid, d.tdata FROM art_data d
-- Only for specific articles 
WHERE d.oid IN

[sqlite] PRAGMA Synchronous safety

2015-03-13 Thread Mario M. Westphal
>So would it be possible to run that command each time you open the config 
>database and after any change to it ?  That would give us a perfect way to 
>find out which commands were causing your problems.<



Not really possible. The average update rare is low, but there are times when 
hundreds of settings are written, depending on which changes the user makes in 
preferences etc. Users can update settings from custom scripts, which may mean 
one update per session or hundreds per minute. Running a 5 second integrity 
check after each write would bring down performance badly. 



I now also run an integrity_check when closing the settings database during 
application shut-down and will seek to find a way to notify the user to retain 
the log file - in the hope that it contains more info. My users are no IT 
folks, just average users, moms & pops. Displaying scary error messages about 
damaged databases and asking to send log files will cause a lot of additional 
support and probably bad reviews in social media. Database damage is a very 
sensitive area.





[sqlite] PRAGMA Synchronous safety

2015-03-12 Thread Mario M. Westphal
I dump the output to the log file so when a user sends me a log after the 
diagnosis reported a ?repaired? damage, I see one or more entries like:



?row 2481 missing from index idx_settings_sndidmnun?



I will see if I can collect more log files in the coming months.







[sqlite] PRAGMA Synchronous safety

2015-03-12 Thread Mario M. Westphal
One thing to add:



I was sometimes successful to remote-repair a corrupted database by telling the 
user how to use sqlite3.exe and calling REINDEX.



After learning that, I added this to my diagnosis routine so if 
integrity_check() returns something that?s not ?ok?, my application runs a 
REINDEX on the database and then runs integrity_check again. These cases are 
now automatically repaired and usually go unnoticed by the user. I tried to 
find a reproducible case where indices get broken or out of sync with the 
pages, but was unable to.



[sqlite] PRAGMA Synchronous safety

2015-03-12 Thread Mario M. Westphal
I have enabled the error callback and it logs everything except SQLITE_SCHEMA 
and SQLITE_OK. 



In the log files I have received so far, none of these were reported (I use 
special headers to identify SQLite errors because log files are often several 
hundred MB in size).



I?m waiting for a log file which actually containers some SQLite error info 
before the disk image corrupted error is logged. I?ve asked the users who had 
the problem once to run the diagnostics more often so we catch problems faster. 
 But often it?s just a ?Worked yesterday, today it does not start?. And then I 
get the log file and it shows ?disk image malformed? right after the database 
open procedure. 



It?s really hard to get more info about this, but I?ll try.





[sqlite] PRAGMA Synchronous safety

2015-03-12 Thread Mario M. Westphal
> Why are you using shared cache, especially with WAL?  Are your devices memory 
> constrained?



I was under the impression that shared cache has performance benefits when used 
in a MT environment. 





[sqlite] PRAGMA Synchronous safety

2015-03-12 Thread Mario M. Westphal
> If those databases are small then running "PRAGMA integrity_check" on them 
> should be very fast.



My application does that. This is how users find out that their settings 
database has become corrupted.

So far it is just reported and then the config database is restored from last 
working backup. I did not anticipate that this small database will ever become 
damaged, except in the case of blue-screens or power failures. For this a 
simple ?Config database is damaged, restoring from last backup?? message to the 
user seemed sufficient. No detailed logging is implemented at this time.







[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
I don?t run pragma quick_check in my application ? and I don?t have any 
corrupted database here. I only ever see the log files when my application 
reports a damaged application, and there I see the ?disk image malformed? error 
message.



The diagnosis routine included in my application and which runs weekly by 
default performs logical tests on the data stored in the database, but also 
low-level tests like pragma integrity_check to ensure the SQLite database is 
physically correct. Either the damage is discovered here (in these cases I dump 
the first 100 rows returned by integrity_check into the log file) or during 
normal operation, e.g. while opening a database.







[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
Sorry, typing on a handheld with too thick thumbs :)



The database files can be on remote storage (e.g. Windows server or NAS) (I 
know that this may be a factor and we discussed this already and I know the 
texts on your web site about false/incomplete locking etc). I just mentioned 
that here to show that using pragma quick_check is not practicable in a 
production environment when you deal with larger databases.



All recent occurrences of the problem where on local hard disks / SSD disks. It 
even affects the small configuration databases my application maintains in 
SQLite, with the FULL synch mode for maximum safety. These also use WAL / 
shared_cache but are only updated a couple of times per minute and have to 
endure much less stress by concurrent multi-thread read/write.



[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
One process only.



[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
The problem is that the databases usually are several gigabytes in size, which 
make them too large for the users to send them to me.

A pragma quick_check takes one to two minutes, if the database is on remove 
storage even more?



I currently only have a small 30 MB database which reports ?ok? for the pragma 
quick_check but

?row 2481 missing from index idx_settings_sndidmnun? for pragma integrity_check.



This is only a small settings database, and shows none of the disk image 
malformed problems. I can send you that DB if that helps. Let me know where to.



A while ago I had a damaged database, but I have deleted for space reasons. 
Sorry.

I will try to get a damaged database here as soon as a customer reports another 
damage.









[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
PRAGMA quick_check



Talks ~ 120s on an average size (4 GB) database. Database is on a SSD. Cold 
database, right after open, nothing in the Windows file cache yet. Running it 
again takes about 100 seconds, not much faster. Nothing you can run very often 
or in the background because it keeps the disk busy and blocks other 
transactions and database access.







[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
I use 3.8.8.1, source id is 2015-01-20 16:51:25 
f73337e3e289915a76ca96e7a05a1a8d4e890d55

I compile the amalgamation using Visual Studio 2012.



The options I use are:


encoding='UTF-16le';

journal_mode=WAL;

wal_autocheckpoint=2; // better bulk speed inserts

locking_mode=EXCLUSIVE;

synchronous=NORMAL;



page_size=4096;

cache_size=16384;

auto_vacuum=NONE;

temp_store=MEMORY;

foreign_keys=ON;



I implement nested transactions using checkpoints, with an outer BEGIN 
TRANSACTION.

10 or more threads may have a database connection open, but each uses its own 
sqlite3 connection created via sqlite3_open_v2().
There are typically many readers, but only one writer (BEGIN IMMEDIATE) is used 
explicitly or implicity.



Unfortunately, the database damage seems to happen in rarely used sections of 
the database file so the problem is only detected during the (usually) weekly 
diagnosis runs. I?m waiting for a ticket where the user encounters the damage 
problem and still has all the log files available. Since my application logs 
all non-OK SQLite return codes plus has an error callback, such a log file may 
provide additional info about how and when the damage happened.



[sqlite] PRAGMA Synchronous safety

2015-03-11 Thread Mario M. Westphal
I?m also concerned about this.



I have used SQLite since around 2008 with great success.



For the latest version of my application I decided to switch to using WAL mode 
and shared cache, to gain better performance.



I use pragma synchronous=NORMAL and for a highly-critical (yet small, 2 MB) 
database even synchronous=FULL.



My application is multi-threaded, but each thread uses a separate instance of 
SQLite. 

Typical database sizes are from 2 MB to 8 GB.



Recently I get an alarming large number of reports about databases with the 
?database disk image is malformed?.

Pragma integrity_check() reports a variety of errors in these cases. 



The typical scenario is: Windows 7 or 8. One user. Database on a local hard 
disk or SSD. No power failure. No blue-screen or other issue from the ?How to 
damage your SQLite database? help topic. The error just happens. It may cause 
SQLite to refuse loading the database, or it is discovered when my users run 
the routine diagnosis, which includes a pragma integrity_check() and if it 
returns anything other than ?OK? it flags the database as defect. Users then 
need to restore the database from the backup. A database once marked as defect 
will always stay in that state.



My application logs all error codes from SQLite but I have not seen anything 
unusual in the logs provided to me by customers. I have even added an error 
callback for SQLite (as per our recent discussion), to get more info about the 
error. In the log files I have seem, this handler was not called and there are 
no unusual errors. But the database corruption may be detected days after the 
actual error happened, and then the original logs are gone. I depend on my 
users to get logs and that does not work too well.



Questions:



Is there a higher risk in using WAL and/or shared_cache? In a mulit-threaded 
environment?



Is it safe to switch existing databases back from WAL to non-wall mode? Is 
there a specific workflow?



SQLite was always so reliable and I had maybe one report about a damaged 
database in 3 months (and usually it was a hard defect or a power failure). But 
now I get reports about damaged databases every week, sometimes even for new 
databases which have been created an hour ago. 



I fear that by using WAL/shared_cache with multiple SQLIte instances in 
multiple threads I somehow stress out SQLite, causing database damage under 
some conditions.

I would go back to the tried-and-true non-WAL no shared-cache mode and let a 
few hundred users test that for a while to see if this causes a drop in damaged 
databases.



Re: [sqlite] "database disk image is malformed" error occurs more (AGAIN, damage)

2015-02-03 Thread Mario M. Westphal
Had another damaged database report.

This time it is a configuration database which holds application settings. 

 

The file is stored on a local disk, not shared, and only ever accessed by my 
application and only by one thread.

The database is run in FULL sync mode for maximum security.

I’m puzzled. These files are very small, a few MB only.


The error message is

 

*** in database main ***Page 15: Rowid 3050 out of order (max larger than 
parent max of 3016)Page 128: Rowid 3017 out of order (min less than parent min 
of 3050) 

 

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-31 Thread Mario M. Westphal
As I wrote above, damaged databases are replaced. No user continues working 
with a damaged database once it has been identified. The issue here is to 
detect this early and avoid it altogether.

 

> One column of one row of one table may get corrupted.  

> If that's the case then the database can be used for years 

> without any problem being noticed.

> Theoretically "PRAGMA integrity_check" will notice it, however.

 

a) As I wrote above. 

b) integrity_check must find such issues. That’s how I understand it and 
Richard told me once.

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-31 Thread Mario M. Westphal
1. No client-server, I use MySQL, SQL-Server or ORACLE for that.

 

2. No access to the SQLite database ever by more than one process concurrently 
in writable mode. In readable mode, yes. But the reported damage cases were 
always single user, one PC.

 

3. I cannot prevent or disallow users to keep their databases on NAS or remote 
server storage. Telling them that keeping a file on a NAS box will probably 
damage the file would be sales venom. 

Despite, the reported cases were all databases stored on local disks, except 
one. 

I keep databases between 0.5 and 10 GB on NAS (Linux/SAMBA) and Windows servers 
for my test scenarios and perf/load tests. No troubles, no corruption.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-30 Thread Mario M. Westphal
I estimate that over 90% of the users keep the database on local disks. I can 
tell from the log files.

Keeping the SQLite database it on a network server really hurts performance. 
That’s not what SQLite is designed for, besides all other aspects of network 
locking mentioned in various SQLite docs. I use a MySQL or other RDBMS backend 
for such scenarios.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-30 Thread Mario M. Westphal
> Okay.  First, stop doing VACUUM after this.  You're not improving things and 
> you may be making things worse

 

Not important. If this error is encountered the database is marked and the user 
reminded on every open/close to replace it with a backup. The database is not 
supposed to be used after SQLite has reported it as corrupt.

 

When a user encounters the problem he/she restores the last working backup. I 
have a few users who faced this problem more than once. Here I always assumed 
some hardware glirch, a shaky USB connection, disk trouble, network problems 
(if the database is held on a Windows server or NAS), buggy SAMBA 
implementations or similar. Often, when asked, users ‘remembered’ a power 
failure, or some other problems with the disk or network. Case closed.

 

 

What worries me more are the incidents where users see this problem happen 
several times, with q database kept on a local hard disk or SSD. The Windows 
event log shows no reports about anything disk related. No power failure. No 
hard shut-down. No problems reading or writing data in other applications.

 

The database may be several months old or fresh. The error is sometimes 
encountered during a diagnosis run (with integrity_check) or a 
SELECT/INSERT/UPDATE suddenly returns the dreaded SQLITE_CORRUPT error code. 
This can happen for databases with 200 MB or databases with 10 GB. It 
apparently does not necessarily happen during times of high activity or bulk 
inserts. But that’s really hard to tell, because unless SQLite has to access a 
corrupted section of the file during normal operation, or integrity_check() is 
run, a damaged database may behave perfectly normal for a long time...

 

I have now implemented the ErrorCallback routine and future versions will log 
anything reported that way to the log file. Maybe this gives us some more data 
to work with. I assume that this function is safe to use in a scenario where 
multiple instances/connections of SQLite are in use in parallel? My application 
uses multiple threads, but each thread uses a separate instance of SQLite.

 

-- Mario

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-30 Thread Mario M. Westphal
- The databases in question are stored on a location hard disk or SSD.

- If a user stores his database on a NAS box or Windows server, it is accessed 
directly, via standard Windows file system routines.

- From what I can tell, network-based databases are not more likely to corrupt 
than databases stored on built-in disks or SSDs or databases kept on disks or 
USB sticks connected via USB.

- My software is updated every 2 to 4 weeks, and I always include and ship with 
the latest SQLite version. 

- There is a big variance in when users update so some users may work with 
versions several months old, but not older than 2 months, typically.

- A user may access a database from multiple computers, but then only in 
read-only mode. Write access is only permitted when the database is opened in 
exclusively.

- I use SQLite since about 2008, but the code base is changed frequently. I 
maintain old databases (up to maybe one year old and use them in regression 
tests before shipping). 

 

-- Mario

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-29 Thread Mario M. Westphal
The core code is in place since about 2008. 

 

I took advantage of changes in SQLite over time, from using the shared cache to 
switching to WAL mode for databases which are not opened in read-only mode.

These changes were made between 12 and six months ago, and tested during beta 
tests and also in the wild. 

Most database damaged errors encountered over time could be pinned to power 
failures, disk or network problems. 

But a too high number of recent reports (couple of months) could not be linked 
to any hardware problem or power failure. 

 

My application uses multiple concurrent threads, but each thread works with its 
own instance of SQLite (on the same database). Transactions are used to improve 
performance and for control 

flow. Every error returned by SQLite is logged to the log file. If a SQLite 
function returns the dreaded “disk image malformed” error, my application 
immediately stores that error and remembers it – the database is marked as 
defective and the user is notified as soon as possible.

 

My users run daily backups of all their important data, including the database 
so usually they can roll-back to the last known working backup and continue. 

 

I will implement Richard’s suggestions to gather more info to the log file. The 
next time a user reports the problem, we may get extra hints about why and when 
this happened.

 

Thanks for the great support and advice.

 

-- Mario

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-29 Thread Mario M. Westphal
The diagnosis log of my application reports the output of integrity_check() 
already.

I retrieved the log from the most recent error report. This is my application 
has logged:

 

'*** IN DATABASE MAIN ***

 

ON TREE PAGE 385120 CELL 24: INVALID PAGE NUMBER 151192068

 

CORRUPTION DETECTED IN CELL 24 ON PAGE 385120

 

CORRUPTION DETECTED IN CELL 25 ON PAGE 385120

 

MULTIPLE USES FOR BYTE 1612 OF PAGE 385120

 

FRAGMENTATION OF 30 BYTES REPORTED AS 0 ON PAGE 385120'

 

 

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-29 Thread Mario M. Westphal
My application does not phone home :-/ but I can add output of these functions 
to the log file my application maintains. My users know how to collect these 
log files and send them to me.

I will also add the error logging callback to my wrapper class and route it to 
the log file.

 

This should give additional information in case these errors repeat. I will do 
that right away and ship this with the next update. Will take a couple of weeks 
to saturate the user basis.

 

 

 

 

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "database disk image is malformed" error occurs more

2015-01-28 Thread Mario M. Westphal
1. I don’t have the damaged databases here so I cannot run the diagnosis 
myself. The databases are usually too large to upload or transfer.

2. The SQLite version I currently use is 3.8.8.1 (complied using the Amalgation 
and Visual Studio 2012).
But since not every user always keeps up to day, older versions of SQLite are 
also in use, some maybe 3 to 4 four months old.


2. Sorry for being not more specific. With “running analysis” I meant that may 
application runs a 

PRAGMA integrity_check(100)

after running a wide range of logical checks which checks the data stored in 
the database itself. 

My diagnosis routine then runs ANALYZE for good measure and because the data in 
some of the large tables may change over time a lot. 
If an error has been found by integrity_check(), the diagnosis runs a REINDEX 
operation because this could save the database sometimes in the past.
As the final step of the diagnosis routine, my application runs a VACUUM to 
compress the database (I run the database with auto_vacum=OFF).


3.  The "disk image is malformed" error is often encountered during normal 
processing, when one or more SQLite functions return the error e.g. during 
adding large amounts of data. Since my application often works in an unattended 
mode, it records such errors, logs them into a log file and then informs the 
user at the earliest opportunity. I know that an "disk image is malformed" may 
go unnoticed for a long time, especially if the user does not run the diagnosis 
routines frequently.

3. I have read  https://www.sqlite.org/howtocorrupt.html of course. This is 
what I refer to as "rule book of how to damage your SQLite database".

-- Mario



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] "database disk image is malformed" error occurs more frequently...?

2015-01-28 Thread Mario M. Westphal
Hello,

 

I’m using SQLite in one of my applications for several years with great success.

The databases managed with SQLite are between 1 and maybe 10 GB, with about 50 
tables or so.

The platform is Windows 7 or higher.

 

Recently I get an increasing number of error reports about “database disk image 
malformed” errors from my users. These errors show up out of the blue, with 
databases held on local hard disks or even SSD’s, no power failures, Windows 
crashes or anything that’s in the rule book of “How to damage your SQLite 
database”.

 

The damage is usually detected during “diagnosis” runs. This feature runs an 
“analyze” and a” vacuum” command in order to physically validate the database 
(and to optimize and compact it).

 

Are there any settings/options I can check and which are known to increase the 
likelihood of physical database damage?

 

+ I always use the most recent version of SQLite.

+ I switched to using WAL mode during a larger update about a year ago. 

+ I use syncmode=NORMAL for a good balance between speed and security.

+ I have PRAGMA wal_autocheckpoint=2 to speed up bulk inserts (this tip 
came from drh).

+ I use nested transactions implemented via checkpoints

 

Anything I need to look for or check?

 

I was under the impression that physical damage is very unlikely and only 
happens under well-known conditions. Maybe something has changed in recent 
SQLite builds that somehow causes this to happen more often? I recall that 
physical damage was really, really rare over the past years – but now I get 
reports maybe once a week…

 

Thanks for reading and your ideas and comments.

 

-- Mario

 

 

 

 

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Severe performance degradation between 3.8.4.3 and

2014-08-22 Thread Mario M. Westphal
Thanks, Richard

 

After swapping back to the latest SQLite version and running an Analyze on the 
sample databases, performance is up to the same level as before (maybe even a 
bit faster). Very good.

 

I will send out a recommendation to my users to run the weekly diagnostics 
routine immediately to restore performance.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Severe performance degradation between 3.8.4.3 and

2014-08-21 Thread Mario M. Westphal
The sample database was produced by a version of my software which runs the 
previous (or even an older version of SQLite). 

My software runs an Analysis as part of a weekly database maintenance 
procedure. But the users can turn this off or delay it for weeks. Various 
versions of my software are in use, and each version links against a different 
version of SQLite. Not all users keep up with upgrades, or skip some of the 
monthly updates...

When I understand you correctly, I should/must run an ANALYSIS on existing 
databases after shipping a new version of SQLite with my application? In case 
the statistics data in existing databases causes the updated optimizer to 
choose slower execution paths? This can be arranged. 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Severe performance degradation between 3.8.4.3 and

2014-08-21 Thread Mario M. Westphal
Hi, Richard

I have prepared a sample database, sample statements and some additional 
details and sent it to your email address.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Severe performance degradation between 3.8.4.3 and

2014-08-21 Thread Mario M. Westphal
Hi,

 

Information provided as requested.

 

_temptable is a temporary table which contains a list of oids (integer, ~ 10 
rows) to consider.

 

 

Stats3

 

tbl   idx  neqnlt   ndltsample

stack_elem idx_rel_stack_elem_soid 4  0 
 0  406

stack_elem idx_rel_stack_elem_oid   1  0
  0  4

stack_elem idx_rel_stack_elem_oid   1  1
  1  5

stack_elem idx_rel_stack_elem_oid   1  2
  2  6

stack_elem idx_rel_stack_elem_oid   1  3
  3  133

stack_elem idx_rel_stack_elem_soid_oid4  0  
0  406

 

Stats2

 

 

tbl   idx  stat

stack  idx_rel_stack_toid_rtype 210 1 1

stack_elem idx_rel_stack_elem_soid 4 4

stack_elem idx_rel_stack_elem_oid   4 1

stack_elem idx_rel_stack_elem_soid_oid4 4 1

 

 

For query 2.1

 

selectid order from  detail

1  0  0  SCAN TABLE stack_elem AS e USING 
INDEX idx_stack_elem_oid

1  1  1  SEARCH TABLE stack AS s USING 
INTEGER PRIMARY KEY (rowid=?)

1  2  2  SEARCH TABLE _temptable AS _t 
USING INTEGER PRIMARY KEY (rowid=?)

1  3  3  SEARCH TABLE _temptable AS _t2 
USING INTEGER PRIMARY KEY (rowid=?)

2  0  1  SCAN TABLE stack AS s

2  1  2  SEARCH TABLE _temptable AS _t 
USING INTEGER PRIMARY KEY (rowid=?)

2  2  0  SEARCH TABLE rel_rel AS r USING 
COVERING INDEX idx_rel_rel (moid=?)

2  0  0  USE TEMP B-TREE FOR DISTINCT

2  0  0  USE TEMP B-TREE FOR ORDER BY

0  0  0  COMPOUND SUBQUERIES 1 AND 2 (UNION)

 

For query 2.2

 

selectid order from  detail

0  0  0  SEARCH TABLE stack USING INTEGER 
PRIMARY KEY (rowid=?)

0  0  0  EXECUTE LIST SUBQUERY 0

0  0  0  SCAN TABLE stack USING COVERING 
INDEX idx_stack_toid_rtype

0  1  1  SEARCH TABLE stack_elem USING 
COVERING INDEX idx_stack_elem_soid_oid (soid=?)

 

 

 

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Severe performance degradation between 3.8.4.3 and 3.8.6

2014-08-20 Thread Mario M. Westphal
Hello,

After  re-compiling my Windows application (compiled with Visual C++ 2012, 32 
Bit application) with the latest SQLite version (3.8.6) I noticed a severely 
degraded performance with certain queries. The new version is 10 or more times 
slower than the previous build I used (3.8.4.3).

1.  The table schemas are:

CREATE TABLE stack (oid INTEGER PRIMARY KEY, rtype INTEGER, toid INTEGER, state 
INTEGER, color INTEGER);
CREATE INDEX idx_stack_toid_rtype ON stack(toid,rtype);


CREATE TABLE stack_elem (soid INTEGER, oid INTEGER, FOREIGN KEY(soid) 
REFERENCES stack(oid) ON DELETE CASCADE);
CREATE INDEX idx_stack_elem_oid ON stack_elem(oid);
CREATE INDEX idx_stack_elem_soid ON stack_elem(soid);
CREATE UNIQUE INDEX idx_stack_elem_soid_oid ON stack_elem(soid,oid);


2.  The queries to run are:

2.1

SELECT DISTINCT e.oid FROM stack_elem e 
INNER JOIN stack s ON e.soid = s.oid 
INNER JOIN _temptable _t ON e.oid = _t.oid  
INNER JOIN _temptable _t2 ON s.toid = _t2.oid 
WHERE s.state = ?1 AND s.toid <> e.oid 
UNION 
SELECT DISTINCT loid FROM rel_rel r 
INNER JOIN stack s ON r.moid = s.toid 
INNER JOIN _temptable _t ON s.toid = _t.oid WHERE s.rtype = ?3 AND s.state = ?1 
ORDER BY 1 

2.2

DELETE FROM stack WHERE oid IN (
SELECT stack.oid FROM stack
LEFT JOIN stack_elem ON stack_elem.soid = stack.oid AND stack_elem.oid 
<> stack.toid 
WHERE stack_elem.soid IS NULL)
AND stack.rtype =?1


Especially the 2.2 query has become an order of magnitude slower.
Reverting back to the 3.8.4.3 build immediately solves the problem and restores 
performance for the same database.

Thanks for looking into this. 
If more information or sample data is needed, let me know.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-20 Thread Mario M. Westphal
> Unrelated to your question, but, take a look at "external content" FTS4
table
> they dramatically cut down the amount of duplicated data [1])

Thanks for the tip. I'll definitely check that.
Currently I build the contents for FTS from several other tables, combining,
splitting, merging data via SQL as needed when INSERTing into the FTS
tables.
Maybe I can safe some of this by linking to the original data tables.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Mario M. Westphal
> Unrelated to your question, but, take a look at "external content" 
> FTS4 table they dramatically cut down the amount of duplicated data 
> [1])

Thanks for the tip. I'll definitely check that.
Currently I build the contents for FTS dynamically from several other
tables, combining, splitting, merging data via SQL as needed when INSERTing
into the FTS tables. Maybe I can safe some of these efforts and reduce the
amount of data in the FTS tables.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Mario M. Westphal

>
If you want to try running with synchronous=NORMAL, you might try setting
PRAGMA wal_autocheckpoint=10; (from the default of 1000) which will
make for dramatically larger WAL files, but also dramatically fewer syncs.
Then the syncs will use just 5 or 6 minutes instead of 4.5 hours. Hopefully.
<

Thanks for the tip! I will add that and combine it with synchronous=NORMAL.

The wal_autocheckpoint documentation is not that clear (IMHO) about how this
setting can impact performance.
Maybe adding a sentence to the documentation, explaining the relation
between wal_autocheckpoint, synch frequency and performance, would help
other (new) users.

If I set wal_autocheckpoint=1, I will get 1/10 of the synchs and WAL
file of about 10 MB, correct?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Mario M. Westphal
I have a performance effect which I don't quite understand.
Maybe I'm using the wrong settings or something. Sorry for the long post,
but I wanted to include all the info that may be important.
 
My software is written in C++, runs on Windows 7/8, the SQLite database file
is either on a local SATA RAID disk or a SSD.
Typical database sizes are between 2 GB and 8 GB.
The largest tables hold several million entries. Also FTS4 is used, which
also creates large tables.
Fast internal RAID disks, SDD. Four Xeon cores. 8 GB RAM.
 
I'm using SQLite 3.8.0.2
WAL mode, shared cache enabled.
locking_mode=NORMAL
checkpoint_fullfsync=0
pragma page_size=4096
pragma cache_size=16384
 
General (retrieval) performance is excellent!
 
 
During an ingest phase, my application pumps in hundreds of thousands of
records into multiple tables.
There are massive amounts of writes during that phase, different record
sizes, tables with one to four indices etc.
 
My application is multi-threaded and inserts data into the database
concurrently from multiple threads.
The threads process data in batches, and use SQLite transactions to process
all records of a batch into the database. Transactions gain a lot of speed,
which outweighs the side effects of potential blocking.
The threads monitor the execution times of the database operations and
adjust the batch size to balance speed and transaction lock duration. Slower
operations cause smaller batches, which results in shorter database locks
and better concurrency. The system adapts fairly well to system performance
and data structure.
 
The performance was not that bad, but far from good.
 
For a given set of input data (100,000 "elements"), the execution estimate
was about 5 hours.
Database on a high-speed SSD.
The largest table holds about 5 million entries afterwards.
 
***With one single change*** I improved the execution time from 5 hours down
to about 30 minutes!
 
I changed 
 
PRAGMA synchronous=NORMAL
 
to 
 
PRAGMA synchronous=OFF
 
Also all other database write operations just 'fly' now.
I'm even more impressed with SQLite than before, but I wonder why is the
change so _dramatic_ ?
 
>From the docs my impression was that using WAL mode is ideal for bulk
inserts. That wrapping large bulks of data into smaller batches, wrapped in
BEGIN/COMMT is best for performance etc. That using synchronous=NORMAL
limits the file system flush/wait operations certain really important
operations.

I logged the execution times of various operations in this phase to a text
file. Everything was fast, the processing, the INSERTs etc.
But COMMIT operations sometimes took 20s, then 0.2s, then again 10s. That's
the time SQLite spends in the execute call with "COMMIT".

Of course the amount of data written in each transaction block varied, but
in general, 85% of the total execution time of my code was spent in the
COMMIT call.
My application was the only application with measurable disk I/O at the
time. No virus checker etc.
 
Changing to synchronous=OFF made the commits 10 times faster.
 
Is this the expected behavior or am I missing something obvious?
 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to detect "database disk image malformed" in advance?

2013-11-08 Thread Mario M. Westphal
> If you have a contrary example, please send me a
> copy of the database file via private email.

I take it from your reply that the integrity_check indeed should reveal
problems in the database file which cause the "disk image malformed" return
code so my logic is OK as it is.

Unfortunately, the database is at a customers site and has about 4 or 5 GB
in size. I doubt I can get this DB to you somehow. 

I asked the user to send me the log files my application automatically
writes. Now the problem looks slightly different.

The diagnosis with the "OK" result was on 11/7 but the error happened on
11/8. So there most likely where a lot of database activity in-between,
although no disk problems or other issues. If integrity_check would have
revealed problem but did not, the database seems to have become corrupted
between these two events. I'll ask him to run the diagnosis again and check
if the integrity_check reports something.

I test the result for the "OK" result and is something different is
returned, I write the first 100 rows into a log file. Maybe we see something
there.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How to detect "database disk image malformed" in advance?

2013-11-08 Thread Mario M. Westphal

I have implemented diagnostic routines which allow my users to check the
database for problems.
If something bad happens to a database (which may be not the fault of SQLite
at all, e.g. a network problem or disk problem) I want to detect this as
early as possible in order to inform the user. This allows them to restore
their backups before too much data is lost.

The first phase in my analysis checks the data in the database for logical
problems, business logic issues etc.
After this phase has been completed, my diagnosis function runs:

1. PRAGMA integrity_check(100)
2. VACUUM
3. ANALYZE

My idea was that these routines should reveal problems reading from or
writing to the database file.

Today I had a case where my diagnosis considered a database as OK (all
SQLite functions used in the diagnosis returned SQLITE_OK), but the database
still runs into the dreaded "disk image malformed" problem shortly
afterwards :-(

My question: Is there a API function etc. which performs a reliable test of
all database pages and other structures? Something that will detect that
"disk image malformed" will be returned when some areas of the database are
updated?


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] VACUUM silently fails, database size not changed

2013-08-31 Thread Mario M. Westphal
Unfortunately this does not help :-(
 
It may have still something to do with WAL.
Before I run the VACUUM the WAL file is quite small.
After the VACUUM has completed, it is about 20 MB - about the same size as
the properly compacted database would be.
But when I run a pragma wal_checkpoint; the WAL file does not shrink and
neither the database.
When I close the database, the WAL is deleted but the database remains at
120 MB.
 
Running a vacuum with the sqlite3 command line utility afterwards shrinks
the database to 20 MB. 
 
There must be some kind of condition or lock or whatever in my app which
prevents the vacuum to work properly.
It returns success and the WAL file grows, but the database size is not
reduced. I must be missing a step
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] VACUUM silently fails, database size not changed

2013-08-31 Thread Mario M. Westphal
I run the VACUUM command at the end of a diagnosis and cleanup operation on
my database.
I use the SQLite 3 API on Windows 7. Latest version of SQLite.
My database uses the WAL mode.
 
The database size is 120 MB when I run the sqlite3_execute("VACUUM",...)
command.
After about 20 seconds of heavy disk activity, sqlite3_execute returns with
SQLITE_OK.
The databate size on disk remains unchanged although I can see the sizeof
the WAL file rise to about 20 MB.
 
I had expected that the database shrinks because a lot of data has been
removed.
I closed my application and used the command line sqlite3.exe to VACUUM it.
The database shrinks from 120 MB to 20 MB!
 
Question:
 
I checked for open transactions: None.
I checked for pending statements (with sqlite3_next_stmt()): None.
 
Are there other reasons why VACUUM runs (it takes maybe 20 seconds for it to
return, and there is high disk activity during the execute) successfully but
apparently does nothing?
 
 
 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] VACUUM and PRAGMA temp_store

2013-06-25 Thread Mario M. Westphal
Great :-)
 
I guess that PRAGMA temp_store=MEMORY then does not add additional
performance on Windows and I can safely let it to DEFAULT or FILE.
This will avoid the excessive memory usage during VACUUM for my use case.
 
Thanks.
 
-- Mario
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] VACUUM and PRAGMA temp_store

2013-06-23 Thread Mario M. Westphal
Hello List
 
the SQLite databases I use on Windows can become fairly large (several GB).
I just noticed that running a VACCUM on such a large database (where several
of the tables are also real large) can cause excessive memory usage (more
than 2.5 GB RAM in peak).
 
I tracked this down to using  "PRAGMA temp_store=MEMORY" when opening the
database.
 
When I change this to 
 
"PRAGMA temp_store=DEFAULT" or "PRAGMA temp_store=FILE"
 
the VACUUM requires almost no RAM, even for large databases.
 
 
Question:
 
Can I change PRAGMA temp_store for an open database before I run the VACUUM?
I would set it to FIILE before and back to MEMORY afterwards.
 
Is PRAGMA temp_store useful at all when used on Windows? Or can one rely on
the Windows built-in file system cache?
 
When creating/opening a file in Windows, an application can mark this file
as "temporary" (flag: FILE_ATTRIBUTE_TEMPORARY) and Windows will try to keep
it in memory if sufficient memory is available, avoiding all writes to the
medium. Does SQLite use this feature on Windows when it creates temporary
files?
 
 
Thanks for your support.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Potential problem in SQLite 3.7.16.2

2013-05-02 Thread Mario M. Westphal
Thank you for providing this pre-release amalgamation ;-)
I downloaded it immediately and compiled it into my software.
 
The problem has been ++resolved++ and the performance is at least as good as
with previous versions of SQLite.
It even feels a bit faster, although I only could try it with a 20% sample
of the typical data volume.
 
Thank you for the excellent support. Much appreciated.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Potential problem in SQLite 3.7.16.2

2013-05-01 Thread Mario M. Westphal
Thank you for providing this pre-release amalgamation ;-)
I downloaded it immediately and compiled it into my software.
 
The problem has been ++resolved++ and the performance is at least as good as
with previous versions of SQLite.
It even feels a bit faster, although I only could try it with a 20% sample
of the typical data volume.
 
Thank you for the excellent support. Much appreciated.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Potential problem in SQLite 3.7.16.2

2013-04-30 Thread Mario M. Westphal
Hi,

I just tried this (sorry dor the delay) but apparently I'm not having the
right toolset installed.
The make file requires gawk.exe (which I downloaded from sourceforge), but
now it's complaing about a missing tclsh85...

Since the previous SQLite version works fine I think I'll skip this and wait
for the next official release. If the error still persists in that release,
I'll post again.

Thanks for your support.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Potential problem in SQLite 3.7.16.2

2013-04-22 Thread Mario M. Westphal

So far I only used the Amalgamation. Looks like the files on your source
control server require me to build SQLite from source or at least run a
tool/script to build the amalgamation.

I'm using Windows and Visual Studio so the standard Linux build tools and
scripting languages are not available.
Is there an easy way to create the amalgamation on Windows? Otherwise I will
have to change my project to try this change out. Which I can do, but it
will take some extra days before I'll find the time for that.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Potential problem in SQLite 3.7.16.2

2013-04-19 Thread Mario M. Westphal
Hi, thanks.
 
But I'm not sure that I understand you correctly.
 
Is this behavior considered as something that needs to be looked at by the
SQLite Team to restore the original performance, or is this how "it is" now
and I have to find a work-around for good (e.g. applying your suggestion
with CROSS)?  If you look into this, I stick with the older SQLite version
for now, which works perfectly for me.
 
-- Mario
 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Potential problem in SQLite 3.7.16.2

2013-04-18 Thread Mario M. Westphal
This is a SQL Script


/* 
Application linking to SQLite using the Amalgation.
Build Tool: C++, Visual Studio 2012, Windows 7 64-Bit

The table schema and the query enclosed below are in use
for over one year and various SQLite versions.

After downloading and compiling in the SQLite 3.7.16.2, we have a problem.

The INSERT statement which takes only a few seconds with previous builds
now does not return, at least not within several minutes.
It's hard to debug the amalgation in Visual Studio but it looks like
SQLite would be caught in an internal loop inside the step() function call.

Replacing the latest version of SQLite with 3.7.15.1 (which was the one we
used before)
and re-compiling our application solves the problem. The INSERT works again
in a few seconds.

We can provide a sample database etc. on request.
*/


/* Create */

CREATE VIRTUAL TABLE md_fts_core USING
fts4(group_oid,tag_oid,file_oid,lang,data);

CREATE TABLE md_fts_core_tag (oid INTEGER PRIMARY KEY, type INTEGER);
CREATE INDEX idx_md_fts_core_tag_type ON md_fts_core_tag(type);

CREATE TABLE md_tag (oid INTEGER PRIMARY KEY, class INTEGER, group_oid
INTEGER, id TEXT, tag TEXT, idx INTEGER, dtype INTEGER, ntype TEXT, cnt
INTEGER, repeat INTEGER, flags INTEGER, FOREIGN KEY(group_oid) REFERENCES
md_tag_group(oid) ON DELETE CASCADE);
CREATE INDEX idx_md_tag_tag ON md_tag(tag);

CREATE TABLE md_tag_group (oid INTEGER PRIMARY KEY, src INTEGER, id TEXT);

CREATE TABLE md_tag_data (oid INTEGER, tag_oid INTEGER, tdata TEXT, rdata
TEXT, lang TEXT, flags INTEGER, FOREIGN KEY(tag_oid) REFERENCES md_tag(oid)
ON DELETE CASCADE);
CREATE INDEX idx_md_tag_data_flags ON md_tag_data(flags);
CREATE INDEX idx_md_tag_data_oid ON md_tag_data(oid);
CREATE INDEX idx_md_tag_data_oid_tag_oid ON md_tag_data(oid,tag_oid);
CREATE INDEX idx_md_tag_data_tag_oid ON md_tag_data(tag_oid);


/* This insert does not return (at least not within several minutes in
3.7.16.2, but takes about 5-10 seconds with build 3.7.15.1 */ 

INSERT INTO md_fts_core (group_oid,tag_oid,file_oid,lang,data) 

SELECT 
g.oid, d.tag_oid, d.oid, d.lang, d.tdata from md_tag_data d 
INNER JOIN
md_tag t ON d.tag_oid = t.oid 
INNER JOIN
md_tag_group g ON t.group_oid = g.oid 
INNER JOIN
md_fts_core_tag ct ON ct.type = 0 AND ct.oid = d.tag_oid  /* which
tags to include */ 
WHERE
d.oid IN
(1,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,3
4,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,
85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107
,108)  
AND d.tag_oid IN (2157,7309,16265,16579)  

UNION 

SELECT
/* We use group_concat to fold multiple values for one tag into one
value for FTS */
g.oid, d.tag_oid, d.oid, d.lang, group_concat(d.tdata,' ') 
FROM
md_tag_data d 
INNER JOIN
md_tag t ON d.tag_oid = t.oid 
INNER JOIN
md_tag_group g ON t.group_oid = g.oid 
INNER JOIN
md_fts_core_tag ct ON ct.type = 1 AND ct.oid = d.tag_oid /* which
tags to include */ 
WHERE
d.oid IN
(1,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,3
4,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,
85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107
,108)  
AND d.tag_oid IN (2157,7309,16265,16579)  
GROUP BY
d.tag_oid,d.oid
 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users