[sqlite] Use of same aliases in single query

2020-03-23 Thread Rob Golsteijn
Hi list,

I accidentally discovered that I can use the same table alias for two tables
in a single SQL statement.
As long as I do not use a column name that exists in both tables, SQLite
accepts this.
It looks strange to me that I can use the same alias multiple times and that
the two uses co-exist.
As a programmer I'm not used to this, but the SQLite docs do not seem to
forbid it. Is this a bug or a feature?
I guess this behavior will not be changed, for backward compatibility.
Something to document in https://www.sqlite.org/quirks.html ?

Similar behavior exists for result column aliases, but in that case the
ambiguous aliases can even be used.

Examples of duplicate table aliases:

CREATE TABLE aaa (a INT, z INT);
CREATE TABLE bbb (b INT, z INT);

SELECT 1 FROM aaa x, bbb x ON x.a = x.b;  -- Same alias "x" for 2 tables, but no complaints (all column names in the query can be resolved)
SELECT 1 FROM aaa x, bbb x ON x.z = x.z;  -- "Error: ambiguous column name: x.z"
SELECT * FROM aaa x, bbb x ON x.a = x.b;  -- "Error: ambiguous column name: x.z" (during expansion of *)


Example of duplicate result column aliases:

SELECT
    x.a as y,
    x.z as y    -- Same alias, but no complaints
  FROM aaa x;

And the result column aliases can even be used later in the query.

INSERT INTO "aaa" VALUES(1,2);
INSERT INTO "aaa" VALUES(1,3);

SELECT count(),
   x.a as y,
   x.z as y
  FROM aaa x
GROUP BY y;   -- No complaints, even though "y" is ambiguous here

count()|y|y
2|1|3 -- Looks like the first alias "y" is used.

Tested with versions 3.27.2 and 3.15.2.

Regards,
Rob Golsteijn



[sqlite] Documentation issues for collating sequences

2020-01-08 Thread Rob Golsteijn
I encountered a documentation issue at the page that describes how to define 
new collating sequences ( https://www.sqlite.org/c3ref/create_collation.html )

For the 3 variants of sqlite3_create_collation the 5th argument is a function 
called "xCompare", but the text refers to "xCallback" instead of "xCompare":
"The fifth argument, xCallback, is a pointer to the collating function. 
[...]"
Please correct this.

Furthermore the text states:
"The collating function callback is invoked with a copy of the pArg 
application data pointer and with two strings in the encoding specified by the 
eTextRep argument"
But the function xCompare is defined as "int(*xCompare)(void*,int,const
void*,int,const void*)", so it has 2 additional integer parameters. These will
contain the lengths of the two strings, but this is not documented. At first I
guessed they would contain one of the type constants (see
https://www.sqlite.org/c3ref/c_blob.html) to allow collation implementations
like the built-in one where integer < text < blob.
But now I see that the lengths are necessary, especially with blobs that can
contain embedded NUL characters. I guess this also means that I cannot assume
that the strings are NUL-terminated.
Please document that the integer arguments contain the lengths of the strings.
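For what it's worth, a minimal sketch of a collating function that relies on
those length arguments (essentially a hand-rolled version of the BINARY
ordering; the collation name is made up) could look like this:

```c
#include <string.h>
#include <sqlite3.h>

/* xCompare: pArg is the application-data pointer given to
** sqlite3_create_collation(); n1/p1 and n2/p2 are the byte lengths and the
** (not necessarily NUL-terminated) bytes of the two values being compared. */
static int byte_order_collation(void *pArg, int n1, const void *p1,
                                int n2, const void *p2){
  int n = n1 < n2 ? n1 : n2;
  int rc = memcmp(p1, p2, n);       /* compare the common prefix */
  if( rc==0 ) rc = n1 - n2;         /* shorter value sorts first */
  (void)pArg;                       /* unused in this example */
  return rc;
}

/* Registration, e.g. right after sqlite3_open():
**
**   sqlite3_create_collation(db, "BYTEORDER", SQLITE_UTF8, 0,
**                            byte_order_collation);
**
** after which it can be used as:  ... ORDER BY col COLLATE BYTEORDER;
*/
```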

Btw. I expected "xCompare" to have the signature int(*xCompare)(void*, const
sqlite3_value * const, const sqlite3_value * const). Can anyone explain why the
values are passed as strings? E.g. were collating sequences only foreseen for
text values?

Regards,
Rob Golsteijn


Re: [sqlite] Size limits

2020-01-07 Thread Rob Willett

Andy,

I can state that SQLite easily went up to 100GB with zero issues for us.
We decided to change the structure of our database and we reduced our
database size as we didn't want to incur extra VPN costs. We also
realised we had designed the database wrong and it had a lot of useless data
in it... but that's another story :)


I have heard people talking about 300GB SQLite databases and I have no
reason to doubt that SQLite would work. As Simon says, environmental
issues may come into effect before you hit SQLite limits.


I can't comment on large blobs as I'm not sure what "large" is: 1MB, 100MB,
1GB?


Rob

On 7 Jan 2020, at 11:21, Simon Slavin wrote:


On 7 Jan 2020, at 10:29am, Andy  wrote:

What are reasonable limits for size Sqlite3 database file and large 
blobs?


Unfortunately, recommendations (rather than hard limits) are closely 
tied to your OS and hardware.  This is because the things you want to 
avoid are things like busting your cache, or memory faults, and the 
sizes of those are set by your own computer and OS.


Experience of this list has shown that we really can't make any firm 
limits without knowledge of your setup, and you'd have to find someone 
with an identical setup to have a useful conversation.  We can only 
urge you to test things out and see for yourself where your 
performance drops off.


In terms of hard limits, you might like to read this:

<https://sqlite.org/limits.html>


Re: [sqlite] How to increase performance when inserting a lot of small data into table using indices

2019-09-10 Thread Rob Willett

Hartwig,

You have got most of the tricks we know about. Other more experienced 
developers may provide a better insight.


We had to move about 60GB of table data around and we ended up doing
what you have done, with one extra bit: we batched the jobs up in
groups of 10,000 between BEGIN and END to make transactions out of
them. It's not clear if you are doing that.
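A bare-bones sketch of that batching pattern, assuming a hypothetical
two-column table t(a,b) and ignoring error handling, looks like this:

```c
#include <sqlite3.h>

/* Insert nRows rows into a hypothetical table t(a INT, b INT), committing
** every 10,000 rows so that each batch is a single transaction. */
static void bulk_insert(sqlite3 *db, int nRows){
  sqlite3_stmt *stmt = 0;
  int i;
  sqlite3_prepare_v2(db, "INSERT INTO t(a,b) VALUES(?,?)", -1, &stmt, 0);
  sqlite3_exec(db, "BEGIN", 0, 0, 0);
  for(i=0; i<nRows; i++){
    sqlite3_bind_int(stmt, 1, i);
    sqlite3_bind_int(stmt, 2, i*2);
    sqlite3_step(stmt);
    sqlite3_reset(stmt);
    if( (i+1) % 10000 == 0 ){          /* close this batch, start the next */
      sqlite3_exec(db, "COMMIT", 0, 0, 0);
      sqlite3_exec(db, "BEGIN", 0, 0, 0);
    }
  }
  sqlite3_exec(db, "COMMIT", 0, 0, 0);  /* commit any partial final batch */
  sqlite3_finalize(stmt);
}
```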


Rob

On 10 Sep 2019, at 16:02, mailing lists wrote:


I have the following situation:

- in one table relatively small data is inserted (100 bytes per record)
- this table contains three indices
- about 100 million or more records have to be inserted

Insertion really slows down after about 100 000 items have been 
inserted. I suppose that the slow down is related to indexing because:


a) removing the indices brings the speed up
b) it does not matter whether using a solid state drive or a 
conventional one (the overall speed differs but not the phenomenon 
itself)

c) changing the cache size has only a minor impact

So, the best solution I found so far is to disable indexing during
insertion and to index the table afterwards (this is orders of magnitude
faster than inserting with the indexes in place). Are there any better
solutions or other tricks I might try (splitting the table into a data part
and an index part)?
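(A bare-bones sketch of that "load first, index afterwards" approach; the
table and index names below are made up:)

```c
#include <sqlite3.h>

/* Sketch of "load first, index afterwards"; table/index names are made up. */
static void load_then_index(sqlite3 *db){
  /* 1. Drop the secondary indexes so inserts only maintain the table itself. */
  sqlite3_exec(db,
    "DROP INDEX IF EXISTS idx_t_a;"
    "DROP INDEX IF EXISTS idx_t_b;"
    "DROP INDEX IF EXISTS idx_t_c;", 0, 0, 0);

  /* 2. Bulk-insert the rows here, batched in explicit transactions. */

  /* 3. Rebuild each index in one pass over the finished table. */
  sqlite3_exec(db,
    "CREATE INDEX idx_t_a ON t(a);"
    "CREATE INDEX idx_t_b ON t(b);"
    "CREATE INDEX idx_t_c ON t(c);", 0, 0, 0);
}
```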


BTW: I am using journal_mode DELETE. WAL mode only delays the problem
and increases the speed a bit, but not significantly.


Regards,
Hartwig





[sqlite] Differentiate between an empty result set and an error using Python3

2019-09-04 Thread Rob Sciuk


Forgive me if this is an FAQ, but in looking over the Python3 interface to
SQLite3, I cannot see a way to get the result code (SQLITE_OK) after an
execute() command.

My use case is to differentiate between an empty row set (OK) vs an error 
of some kind in the query.


Anyone figured this out?

Cheers,
Rob.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Robert S. Sciuk r...@controlq.com
Principal Consultant    905.706.1354
Control-Q Research  97 Village Rd. Wellesley, ON N0B 2T0


Re: [sqlite] How to create a new table or overwrite an existing table in one command?

2019-09-04 Thread Rob Willett

Ingo,

I can't answer that as we have everything in one file. I suspect that, in
hindsight, putting a large table in a separate file would have been
advantageous. However, the one-off cost of dropping a 59GB table has gone,
and our daily pruning and vacuuming of the table takes a few seconds.
Hindsight is great :)


I brought this up as it was a major issue for us at the time and we
wanted other people to be aware that deleting a table in SQLite is not
as 'cheap' as in other systems. This is the first time we have found SQLite
to not be as good as anything else :) Please note that this is not meant
to be criticism of SQLite, but rather one of the tradeoffs we know
about and accept. We win on so many other things that we have no
issues with it.


Rob

On 4 Sep 2019, at 12:02, ingo wrote:


On 4-9-2019 12:24, Rob Willett wrote:

Peng,

Dropping very large tables is time consuming. Dropping a 59GB table
takes quite a long time for us even on fast hardware. Dropping 
smaller

tables is faster though.



When using (and dropping) these big tables, would it be an advantage to
put only that one table in a separate database and attach it when
needed? There would then be no need to drop it; one could just detach
and delete the db.

Ingo


Re: [sqlite] How to create a new table or overwrite an existing table in one command?

2019-09-04 Thread Rob Willett

Peng,

Dropping very large tables is time consuming. Dropping a 59GB table 
takes quite a long time for us even on fast hardware. Dropping smaller 
tables is faster though.


Not sure what size tables you have but something to think about. We 
experimented with new tables and changing old tables and for us, 
dropping the table and creating a new one was the fastest method but 
still took 40-60 mins from memory.


Rob

On 4 Sep 2019, at 11:18, Peng Yu wrote:

For now, I just delete the db file if it exists already, so that I don't
need to call DROP TABLE IF EXISTS mytab. I guess directly deleting the db
file can be faster than the latter. Is it so?

On Tue, Sep 3, 2019 at 3:06 PM Simon Slavin  
wrote:



On 3 Sep 2019, at 8:57pm, Peng Yu  wrote:

If I try to create a table that already exists, sqlite will give me an
error. Is there a way to issue one command to create a table, but if there is
already a table with the same name, drop it and then create the new table?

Thanks.

Assuming that the new table has a different structure to the old one, do
it in two commands:

DROP TABLE IF EXISTS MyTable;
CREATE TABLE MyTable ...;

Neither of those commands will generate an error.

If you are sure that the new table has the same structure as the old one,
you can do

CREATE TABLE IF NOT EXISTS MyTable ...;
DELETE FROM MyTable;


--
Regards,
Peng


Re: [sqlite] INSERT vs BEGIN

2019-09-03 Thread Rob Richardson
I didn't know it is possible to insert multiple rows into a table using a
command like this.  Is this just an SQLite feature, or is this part of the
SQL standard?

RobR

On Mon, Sep 2, 2019 at 8:14 AM Dominique Devienne 
wrote:

> On Mon, Sep 2, 2019 at 12:52 PM Simon Slavin  wrote:
>
> > > One must know that I am obliged to use "BEGIN DEFERRED TRANSACTION"
> > because other threads need to access the tables.
> > SQLite copes very well when you have one connection writing to the
> > database and other connections reading.  The problems come when you have
> > two connections writing to the database at once.
> >
>
> In WAL mode only! Otherwise readers are blocked when the writer is active,
> and readers prevent the writer from proceeding. --DD
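(A minimal sketch of switching a database to WAL mode, which is persistent
once set; the filename here is only illustrative:)

```c
#include <sqlite3.h>

/* Switch a database to WAL once; the setting is stored in the file, so
** later connections inherit it.  The filename is illustrative. */
int main(void){
  sqlite3 *db;
  if( sqlite3_open("app.db", &db)!=SQLITE_OK ) return 1;
  /* In WAL mode readers keep working from a consistent snapshot while a
  ** single writer appends to the -wal file. */
  sqlite3_exec(db, "PRAGMA journal_mode=WAL;", 0, 0, 0);
  sqlite3_close(db);
  return 0;
}
```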


Re: [sqlite] Tracking item history using SQLite

2019-09-02 Thread Rob Willett

Hi,

We aren't storing first_seen in every row. Each incident is something 
like


Date_Time_of_update_in_epoch_secs1 unique_incident_number <information about the incident>
Date_Time_of_update_in_epoch_secs2 unique_incident_number <information about the incident>
Date_Time_of_update_in_epoch_secs3 unique_incident_number <information about the incident>
Date_Time_of_update_in_epoch_secs4 unique_incident_number <information about the incident>
Date_Time_of_update_in_epoch_secs5 unique_incident_number <information about the incident>


These incidents are spread over time, but normally the updates are around
300-600 secs apart. There is no guarantee that each update has an entry
for that unique_incident_number in it, so we could have an update with an
incident missing, and then one 10 mins later with the incident in. The
data feed is wholly out of our control and it may be that a traffic
officer has not been able to add things in. We have to manage the data being
incomplete, but we've learnt how to by now.


What we do each night is work out the stuff in the middle of the first
and last incident (based on the unique_incident_number) and throw the
middle stuff away. There are other heuristics that determine what we
throw away, so we built a small state machine to work it all out. The
first time we ran the system, it took about three hours to run and work
out what to throw away. That's when we found out that dropping a table in
SQLite is quite time-expensive. However, once you do the first run,
subsequent runs take around 90 secs. This means that the report we run
to work out the averages of incidents dropped from 90 mins to around 90
secs.
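(A rough sketch of that pruning idea; the table and column names here are
hypothetical and the real logic has extra heuristics:)

```c
#include <sqlite3.h>

/* Hypothetical sketch of the nightly prune: for each incident keep only the
** earliest and latest update and delete everything in between.  Real table
** and column names differ, and the extra heuristics are not shown. */
static void prune_middle_updates(sqlite3 *db){
  sqlite3_exec(db,
    "DELETE FROM incident_updates"
    " WHERE update_time > (SELECT MIN(u.update_time) FROM incident_updates u"
    "                       WHERE u.incident_id = incident_updates.incident_id)"
    "   AND update_time < (SELECT MAX(u.update_time) FROM incident_updates u"
    "                       WHERE u.incident_id = incident_updates.incident_id);",
    0, 0, 0);
}
```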


Most of our stuff is written in the world's most hated language, Perl,
which we find easy to use and fast. No idea why people don't like it ;)


We can't do this in real time as it's too much work, so we run it at 
03:35 each day on a cronjob.


Hope this helps.

Rob

On 31 Aug 2019, at 14:45, Rob Richardson wrote:

Why are you storing first_seen in every record?  To avoid searching for it
when reports are generated?

On Sat, Aug 31, 2019 at 6:24 AM Rob Willett wrote:


Hi,

We have a very similar system that captures traffic incident 
information

such as accidents, roadworks, traffic jams and sends personalised
information to drivers based on their routes.

We realised after a few years that our original design for the 
database
was inefficient as we had a single table that was about 60GB, circa 
500M
rows, that consisted of traffic incidents and uptakes. The 
inefficiency

was our design and not SQLite.

This inefficiency showed when we used to run a daily report that
summarised the traffic information for historical trends, e.g. we 
wanted
to know the average time for a road accident to be cleared on a 
stretch

of road on a specific day or time of day. This report mean we read in
circa 500M rows and 60GB of data each night and took circa 90 mins to
run as we did a lot of calculations.

We were not able to change the structure of the database too much (as 
it
was too far down the line). So we spent a long time looking at the 
data

with cold, wet towels on our foreheads in darkened rooms :) After we
analysed the data we realised that most of our 'updates' were not
actually updates at all and all we were actually interested in was the
start and end times of incidents, all the 'stuff' in the middle was 
just
noise.  The problem we had was that disruptions could last a long 
time
and there was no actual end of incident marker, just that there 
wasn't

another disruption with that unique identifier and a later timestamp.
This sounds similar to your situation. Basically we constructed a 
simple

state model based on time.

What we now do is run a Perl script each night to
remove anything that is between the first and end incidents. We know
that the end incident may move on the next day, but at that point in
time it is still the last/end/most recent indicator. Our data is
structured around time so we always have a time of insertion 
indicator

in the field as things may not be ordered by row id.

Once we had this insight, we took our database down from 60GB to 
800MB
and the report that ran from 90 mins to 90 secs. We have to give 
credit
to this mailing list for the patient and courteous way that helped 
and

assisted with our often stupid questions.

The lessons we learnt from this are:

1. Get your database schema right from the go. We didn't. In 
hindsight

we should have spent longer looking at this.

2. Put in more information than you think you may need just in case. 
We

did that. We put time information in at per second granularity.

3. Don't worry too much about disk space unless you are on a 
constrained

device. In hindsight we could have stored even more data :)

4. Think hard about solutions that are simple. We love simplicity as
this makes it easier to go back and change 

Re: [sqlite] Tracking item history using SQLite

2019-09-02 Thread Rob Richardson
Why are you storing first_seen in every record?  To avoid searching for it
when reports are generated?

On Sat, Aug 31, 2019 at 6:24 AM Rob Willett 
wrote:

> Hi,
>
> We have a very similar system that captures traffic incident information
> such as accidents, roadworks, traffic jams and sends personalised
> information to drivers based on their routes.
>
> We realised after a few years that our original design for the database
> was inefficient as we had a single table that was about 60GB, circa 500M
> rows, that consisted of traffic incidents and uptakes. The inefficiency
> was our design and not SQLite.
>
> This inefficiency showed when we used to run a daily report that
> summarised the traffic information for historical trends, e.g. we wanted
> to know the average time for a road accident to be cleared on a stretch
> of road on a specific day or time of day. This report mean we read in
> circa 500M rows and 60GB of data each night and took circa 90 mins to
> run as we did a lot of calculations.
>
> We were not able to change the structure of the database too much (as it
> was too far down the line). So we spent a long time looking at the data
> with cold, wet towels on our foreheads in darkened rooms :) After we
> analysed the data we realised that most of our 'updates' were not
> actually updates at all and all we were actually interested in was the
> start and end times of incidents, all the 'stuff' in the middle was just
> noise.  The problem we had was that disruptions could last a long time
> and there was no actual end of incident marker, just that there wasn't
> another disruption with that unique identifier and a later timestamp.
> This sounds similar to your situation. Basically we constructed a simple
> state model based on time.
>
> What we now do is run a Perl script each night to
> remove anything that is between the first and end incidents. We know
> that the end incident may move on the next day, but at that point in
> time it is still the last/end/most recent indicator. Our data is
> structured around time so we always have a time of insertion indicator
> in the field as things may not be ordered by row id.
>
> Once we had this insight, we took our database down from 60GB to 800MB
> and the report that ran from 90 mins to 90 secs. We have to give credit
> to this mailing list for the patient and courteous way that helped and
> assisted with our often stupid questions.
>
> The lessons we learnt from this are:
>
> 1. Get your database schema right from the go. We didn't. In hindsight
> we should have spent longer looking at this.
>
> 2. Put in more information than you think you may need just in case. We
> did that. We put time information in at per second granularity.
>
> 3. Don't worry too much about disk space unless you are on a constrained
> device. In hindsight we could have stored even more data :)
>
> 4. Think hard about solutions that are simple. We love simplicity as
> this makes it easier to go back and change things. Too many dependencies
> would be too difficult to unpick. Keeping it simple also works for us as
> we're not experts :)
>
> 5. SQLite seems to be able to do anything we want it to. We know the
> 'limitations' of it not being a full client/server database, just to be
> clear, we do not consider this a limitation at all but rather a virtue.
> Other people seem worried about the 'lack' of some datatypes, we do
> masses of data and date conversions as needed and it's never been a
> speed issue or any issue.
>
> 6. Ask this group for help. We realised that the more information we
> provided in a clear and concise manner in our help email, the more help
> we got, it was a virtuous circle. The sum help of the people in this
> group far, far exceeds our own knowledge. As a rule of thumb, if we ask
> for help, we expect it to take 1-2 hours as a minimum for us to write
> the email. That may be because we're not SQL experts. Also follow up
> every email as if people have taken the time to reply to us, they
> deserve a reply themselves. As we are English, we could end up in a
> thanks-for-the-reply endless loop :) The group has an exceptionally high
> signal-to-noise ratio and is invariably courteous.
>
> I would be astonished if you can't get SQLite to do what you want to do.
> We have never managed to touch the sides of the system and suspect we
> never will.
>
> More than happy to answer more questions as this group helped us and
> it's only fair we offer the help back. I will state that we are not SQL
> (or SQLite) experts :)
>
> Rob
>
> On 30 Aug 2019, at 21:44, Random Coder wrote:
>
> > First off, if this s

Re: [sqlite] Tracking item history using SQLite

2019-08-31 Thread Rob Willett

Hi,

We have a very similar system that captures traffic incident information 
such as accidents, roadworks, traffic jams and sends personalised 
information to drivers based on their routes.


We realised after a few years that our original design for the database 
was inefficient as we had a single table that was about 60GB, circa 500M 
rows, that consisted of traffic incidents and uptakes. The inefficiency 
was our design and not SQLite.


This inefficiency showed when we used to run a daily report that 
summarised the traffic information for historical trends, e.g. we wanted 
to know the average time for a road accident to be cleared on a stretch 
of road on a specific day or time of day. This report mean we read in 
circa 500M rows and 60GB of data each night and took circa 90 mins to 
run as we did a lot of calculations.


We were not able to change the structure of the database too much (as it 
was too far down the line). So we spent a long time looking at the data 
with cold, wet towels on our foreheads in darkened rooms :) After we 
analysed the data we realised that most of our 'updates' were not 
actually updates at all and all we were actually interested in was the 
start and end times of incidents, all the 'stuff' in the middle was just 
noise.  The problem we had was that disruptions could last a long time 
and there was no actual end of incident marker, just that there wasn't 
another disruption with that unique identifier and a later timestamp. 
This sounds similar to your situation. Basically we constructed a simple 
state model based on time.


What we now do is run a Perl script each night to 
remove anything that is between the first and end incidents. We know 
that the end incident may move on the next day, but at that point in 
time it is still the last/end/most recent indicator. Our data is 
structured around time so we always have a time of insertion indicator 
in the field as things may not be ordered by row id.


Once we had this insight, we took our database down from 60GB to 800MB 
and the report that ran from 90 mins to 90 secs. We have to give credit 
to this mailing list for the patient and courteous way that helped and 
assisted with our often stupid questions.


The lessons we learnt from this are:

1. Get your database schema right from the go. We didn't. In hindsight 
we should have spent longer looking at this.


2. Put in more information than you think you may need just in case. We 
did that. We put time information in at per second granularity.


3. Don't worry too much about disk space unless you are on a constrained 
device. In hindsight we could have stored even more data :)


4. Think hard about solutions that are simple. We love simplicity as 
this makes it easier to go back and change things. Too many dependencies 
would be too difficult to unpick. Keeping it simple also works for us as 
we're not experts :)


5. SQLite seems to be able to do anything we want it to. We know the 
'limitations' of it not being a full client/server database, just to be 
clear, we do not consider this a limitation at all but rather a virtue. 
Other people seem worried about the 'lack' of some datatypes, we do 
masses of data and date conversions as needed and it's never been a 
speed issue or any issue.


6. Ask this group for help. We realised that the more information we 
provided in a clear and concise manner in our help email, the more help 
we got, it was a virtuous circle. The sum help of the people in this 
group far, far exceeds our own knowledge. As a rule of thumb, if we ask 
for help, we expect it to take 1-2 hours as a minimum for us to write 
the email. That may be because we're not SQL experts. Also follow up 
every email as if people have taken the time to reply to us, they 
deserve a reply themselves. As we are English, we could end up in a 
thanks-for-the-reply endless loop :) The group has an exceptionally high 
signal-to-noise ratio and is invariably courteous.


I would be astonished if you can't get SQLite to do what you want to do. 
We have never managed to touch the sides of the system and suspect we 
never will.


More than happy to answer more questions as this group helped us and 
it's only fair we offer the help back. I will state that we are not SQL 
(or SQLite) experts :)


Rob

On 30 Aug 2019, at 21:44, Random Coder wrote:


First off, if this sort of "code review" style question is
inappropriate for this list, please feel free to reply to me directly
and tell me to stop, and I'll know to not do this again.

That said, I have a question on the use of SQLite.  At the risk of
falling into the XY problem, I'll give some details on what I'm doing,
and then how I'm doing it.  My basic question is if what I'm doing is
valid, and if I'm doing anything needlessly wasteful.

I have a system monitoring events (and I know I

Re: [sqlite] Why no persistent user settings in a database file?

2019-04-03 Thread Rob Willett

Tom,

Because the file is the database, your preferences for how things are 
displayed are just that, your preferences. Don't combine the two. If you 
have two users with different preferences, who wins?


We move the SQLite database around all the time, across different 
servers. We do not want the preferences for one to affect another 
server.


Rob

On 3 Apr 2019, at 11:52, Tom Browder wrote:


After coming back to SQLite from a long absence, I was surprised that
setting things like .mode and .headers in a database didn't stay that 
way

after exiting the file.

Then I remembered something about a resource file and found .sqliterc 
on an

internet search and that allowed the persistent settings I wanted.

However, would it not be more natural to keep those settings 
persistent

inside the db file?

Thanks,

-Tom


Re: [sqlite] Can I get help with db design for SQLite use?

2019-04-03 Thread Rob Willett

Shawn,

I will investigate File::Slurper. Rather than pollute this list, send me 
a mail on

rob.sql...@robertwillett.com about the issues you had.

Thanks

Rob

On 3 Apr 2019, at 8:50, Shawn Wagner wrote:


Yay Perl! My favorite language. DBD::SQLite is definitely one of the
better, fuller featured sqlite bindings out there. Though Tom is using
perl6, not perl5. I have no idea how its version compares.

(At the risk of going off topic, File::Slurp has issues. File::Slurper 
is a

better alternative.)

On Wed, Apr 3, 2019, 12:08 AM Rob Willett 


wrote:


Tom,

We use the Perl DBD::SQLite module. It works very well and I cannot
recall a single issue with it in the last four years. There's not as
much support for Perl on this mailing list as it's not as popular, 
but

most issues you will probably encounter will be at the design level
rather than at the CPAN module level. Our working assumption is that 
if
the DBI module looks like it has an issue, it doesn't and it's our 
code.

Pretty much the same as for SQLite. Whilst I know that SQLite has had
bugs, the chances of us finding them is minimal, so we assume it's 
our

code again.

However I can say that we use Perl for all our system code and use
SQLite within it and its fast and easy.

If you're using Perl, I would also recommend the following other
modules, this is a direct pull from our code. I've removed our code
specific modules.

```
#!/usr/bin/perl -w

use strict;
use warnings;

use Switch;
use DBI;
use JSON;
use Getopt::Long;  <-- Easiest way to get command line args in and processed.
use Data::Dumper;  <-- Utterly essential, don't leave home without it.
use Mojolicious::Lite; <-- Only need if you making a REST based server
use Mojo::Parameters;  <-- Only need if you making a REST based server
use Mojo::URL; <-- Only need if you making a REST based server
use Mojo::Log; <-- Only need if you making a REST based server
use REST::Client; <-- More rest based stuff
use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);
use Sereal::Decoder qw(decode_sereal sereal_decode_with_object scalar_looks_like_sereal);
use DateTime;  <-- Manage date and time properly.
use Net::Curl::Easy;
use Crypt::Random qw( makerandom );
use Log::Log4perl qw(get_logger :levels);
use File::Path qw(make_path); <-- Quick and easy way to make paths and directories.
use Net::Address::IP::Local;  <-- Easy way to manipulate IP addresses.
use File::Slurp; <-- Quick and easy way to read and write files.
use Clone 'clone'; <-- You'll use a lot of objects and structures. Copy them properly and fast.

```

We use Log4Perl a lot as it's easy to get things setup and then you 
can

modify one log file and get easy changes. Also we use Mojolicious for
all the REST code wrapping.

These are tried and tested CPAN modules that we know just work for us
and are a standard part of every new build. The most awkward one is the
curl one; there seem to be a lot of versions of the Curl::Easy stuff
which simply don't work on Mac OS X (or other OSes).

Just my 2p worth.

Rob

On 3 Apr 2019, at 0:21, Tom Browder wrote:


On Tue, Apr 2, 2019 at 17:30 am...@juno.com  wrote:


You might want to import everything into SQLite Studio or SQLite Suite (I
forget the exact name) -- a freebie on the internet. I found it worked for
me. In order to help others, I would appreciate it if you tell us on this
usergroup how you made out. May it work for you. Peace! Alex



Thanks for the idea, Alex.

I'm doing all programmatically at the moment (using a Perl 6 module:
DB::SQLite), but using one of those tools you mentioned would help 
in

design for sure!

-Tom


Re: [sqlite] Can I get help with db design for SQLite use?

2019-04-03 Thread Rob Willett

Tom,

We use the Perl DBD::SQLite module. It works very well and I cannot 
recall a single issue with it in the last four years. There's not as 
much support for Perl on this mailing list as it's not as popular, but 
most issues you will probably encounter will be at the design level 
rather than at the CPAN module level. Our working assumption is that if 
the DBI module looks like it has an issue, it doesn't and it's our code. 
Pretty much the same as for SQLite. Whilst I know that SQLite has had 
bugs, the chances of us finding them is minimal, so we assume it's our 
code again.


However I can say that we use Perl for all our system code and use 
SQLite within it and its fast and easy.


If you're using Perl, I would also recommend the following other 
modules, this is a direct pull from our code. I've removed our code 
specific modules.


```
#!/usr/bin/perl -w

use strict;
use warnings;

use Switch;
use DBI;
use JSON;
use Getopt::Long;  <-- Easiest way to get command line args in and 
processed.
use Data::Dumper;  <-- Utterly essential, don't leave home without 
it.

use Mojolicious::Lite; <-- Only need if you making a REST based server
use Mojo::Parameters;  <-- Only need if you making a REST based server
use Mojo::URL; <-- Only need if you making a REST based server
use Mojo::Log; <-- Only need if you making a REST based server
use REST::Client; <-- More rest based stuff
use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);
use Sereal::Decoder qw(decode_sereal sereal_decode_with_object 
scalar_looks_like_sereal);

use DateTime;  <-- Manage date and time properly.
use Net::Curl::Easy;
use Crypt::Random qw( makerandom );
use Log::Log4perl qw(get_logger :levels);
use File::Path qw(make_path); <-- Quick and easy way to make paths and 
directories.

use Net::Address::IP::Local;  <-- Easy way to manipulate IP addresses.
use File::Slurp; <-- Quick and easy way to read and write files.
use Clone 'clone'; <-- You'll use a lot of objects and structures. Copy 
them properly and fast.


```

We use Log4Perl a lot as it's easy to get things setup and then you can 
modify one log file and get easy changes. Also we use Mojolicious for 
all the REST code wrapping.


These are tried and tested CPAN modules that we know just work for us 
and are a standard part of every new build. The most awkward one is the 
curl one; there seem to be a lot of versions of the Curl::Easy stuff 
which simply don't work on Mac OS X (or other OSes).


Just my 2p worth.

Rob

On 3 Apr 2019, at 0:21, Tom Browder wrote:


On Tue, Apr 2, 2019 at 17:30 am...@juno.com  wrote:

You might want to import everything into SQLite Studio or SQLite Suite (I
forget the exact name) -- a freebie on the internet. I found it worked for
me. In order to help others, I would appreciate it if you tell us on this
usergroup how you made out. May it work for you. Peace! Alex



Thanks for the idea, Alex.

I'm doing all programmatically at the moment (using a Perl 6 module:
DB::SQLite), but using one of those tools you mentioned would help in
design for sure!

-Tom


Re: [sqlite] SQLite slow when lots of tables

2019-01-29 Thread Rob Willett

Millions of rows is not even large, never mind huge or very huge :)

We have tables with hundreds of millions of rows, we got to billions of 
rows in a single table before we changed the logic. From memory we had 
67GB for a single database and I reckon 60GB was one table. Not many 
issues at all with inserting or searching. One of our data mining 
queries searched the entire table and it still only took 90 secs, though 
all of the query used indexes.


We only changed what we store as the management of a circa 60GB database 
was too much and we worked out we only needed 1% of it. We were using a 
virtual private server and we had issues with disk IO when we copied the 
database around using Unix cp. This wasn't a SQLite problem at all. 
However I have no doubt that SQLite was more than capable of handling 
even more data.


Rob

On 29 Jan 2019, at 11:00, mzz...@libero.it wrote:


Dear all,

what happens if I put all the data in a single table and this table becomes
very huge (for example millions of rows)?

Will I have the same performance problems?

Thanks.


Regards.



Il 28 gennaio 2019 alle 17.28 Simon Slavin  
ha scritto:


On 28 Jan 2019, at 4:17pm, mzz...@libero.it wrote:

when the number of tables becomes huge (about 15000/2
tables) the first database read query, after the database is opened, is
very slow (about 4 sec.), while subsequent read operations are faster.

How can I speed up?

Put all the data in the same table.

At the moment, you pick a new table name each time you write
another set of data to the database. Instead of that, create just one
big table, and add an extra column, called "dataset", to the columns
which already exist. In that column you put the string you previously
used as the table name.
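A minimal sketch of that layout (the table and column names below are
only illustrative):

```c
#include <sqlite3.h>

/* Sketch of the single-table layout: one "dataset" column replaces the
** per-dataset table name.  Table and column names are illustrative. */
static int setup_schema(sqlite3 *db){
  return sqlite3_exec(db,
    "CREATE TABLE IF NOT EXISTS measurements("
    "  dataset TEXT NOT NULL,"   /* the string formerly used as a table name */
    "  t       INTEGER NOT NULL,"
    "  value   REAL"
    ");"
    "CREATE INDEX IF NOT EXISTS measurements_by_dataset"
    "  ON measurements(dataset, t);",  /* keeps per-dataset reads fast */
    0, 0, 0);
}

/* Reads then filter on the column instead of picking a table, e.g.
**   SELECT t, value FROM measurements WHERE dataset = ?1 ORDER BY t;  */
```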


SQL is not designed to have a variable number of tables in a 
database. All the optimization is done assuming that you will have a 
low number of tables, and rarely create or drop tables.


Simon.



Re: [sqlite] Mac: Users receive "database disk image is malformed" errors after restoring database from Time Machine backup

2018-12-12 Thread Rob Willett
Whilst Time Machine does not do snapshots the way enterprise storage does
(literally a freeze and recovery point), Time Machine does make
backups suitable for booting from. Apple considers Time Machine suitable
for home use backups.


You can back up with Time Machine and boot from it. My personal experience
of TM is that it's flaky and unreliable, and we use Carbon Copy Cloner AND
ChronoSync to get to backups that I trust to work. YMMV.


However, we are going off the point from the OP. I personally think TM is
to blame and would advise the users to check the TM setup and experiment
with delaying the WAL backup just to see what would happen.


Rob

On 12 Dec 2018, at 16:58, Peter da Silva wrote:

Apple uses Sqlite in a number of applications, including Apple Mail, 
so
they have to have some kind of accommodation for saving sqlite 
databases.


The Time Machine patent does not describe using file system snapshots:


*"An algorithm or other monitoring can be used to detect changes that 
occur

during the backup operation in order to maintain consistency between
related data in the backup. The back up can be performed again for 
related

data that was modified during prior backup operation. *

*"In general, in one aspect, a method is provided. A backup operation 
of
data including a plurality of related items is initiated. 
Modifications to

one or more items of the plurality of related items are monitored for
during the backup operation. The backup operation is completed. If a
modification occurred to one or more items, a second backup operation 
is

performed for the modified items."*

This does not seem to authoritatively state that multiple files will 
be

backed up consistently.

On Wed, Dec 12, 2018 at 9:06 AM Keith Medcalf  
wrote:




I know nothing about "Time Machine", but does it copy the entire
filesystem in (at least) "crash consistent" state?

---
The fact that there's a Highway to Hell but only a Stairway to Heaven 
says

a lot about anticipated traffic volume.



-Original Message-
From: sqlite-users [mailto:sqlite-users-
boun...@mailinglists.sqlite.org] On Behalf Of Daniel Alm
Sent: Tuesday, 11 December, 2018 05:02
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Mac: Users receive "database disk image is
malformed" errors after restoring database from Time Machine backup

Hi,

For the past half year we’ve been receiving reports from users who
had restored their SQLite-based databases from a Time Machine 
backup.

Afterwards, they would receive "database disk image is malformed”
errors. The app also backs up the user’s data “manually” to a 
ZIP

file every week; those backups seem to be working fine. We also
haven’t received reports from other backup tools causing issues. I
have also suspected a bug in Time Machine, but it is striking that
the issues did seem to start occurring after an update to the app
(luckily, in fact, with the same update that also introduced the
“manual” backups).

Changes that we made to our setup in the update that coincided with
the errors occurring:
- Upgraded SQLite from 3.21 to 3.24 (we have since reverted to 
3.23.1

in another update; no improvement).
- Used memory mapping for read accesses via “PRAGMA mmap_size =
1073741824;” (we have since reverted to “PRAGMA mmap_size = 
0;” after

reading
http://sqlite.1065341.n5.nabble.com/Re-Database-corruption-and-PRAGMA-fullfsync-on-macOS-td95366.html;
no improvement).
- Using a secondary database via ATTACH DATABASE
(https://www.sqlite.org/lang_attach.html) (although this also seems
to occur for users without such a database).

At this point, I am at a loss, especially given that SQLite should 
be

fairly robust against database corruption. While our app is running
in the background all the time, it is not very write-heavy (~ one
transaction per minute taking just a few milliseconds). Also, the 
app

had been running fine before the update for a long time without any
reports of this issue. I might be doing something wrong or have
changed anything else, but I don’t know what; if you have any 
ideas,

let me know.

Any suggestions on what could be the culprit or what else I could 
try

besides downgrading all the way to SQLite 3.21 would be appreciated.

Thanks,
Daniel Alm

P.S.: Our database currently uses the following PRAGMAs:

PRAGMA mmap_size = 0;
PRAGMA page_size = 4096;
PRAGMA cache_size = -10240;
PRAGMA foreign_keys = ON;
PRAGMA journal_size_limit = 8388608;
PRAGMA checkpoint_fullfsync = 1;
PRAGMA wal_autocheckpoint = 2048;
PRAGMA journal_mode = WAL;

Happy to provide any more details as needed.

Re: [sqlite] Mac: Users receive "database disk image is malformed" errors after restoring database from Time Machine backup

2018-12-12 Thread Rob Willett

Also are they using Time Machine on a networked drive?

Whilst Time Machine was not supposed to work across networks, people 
have made it work using 3rd party software. I know because we tried it 
for a laugh and abandoned it (and Time Machine) as it was wholly 
unreliable.


However I think the WAL commit (or uncommit) is a more likely scenario.

Rob

On 12 Dec 2018, at 15:00, R Smith wrote:


On 2018/12/12 4:48 PM, Richard Hipp wrote:

On 12/11/18, Daniel Alm  wrote:
Any suggestions on what could be the culprit or what else I could 
try

besides downgrading all the way to SQLite 3.21 would be appreciated.


Nothing about SQLite has changed that should make a difference here.

Do you know if the corruption is occurring when TimeMachine makes its
backup, or is occurring when the backed up database is restored?  Can
you capture some unrestored TimeMachine backups to see if they are
corrupt?


My best guess here is that TimeMachine somehow captures the sqlite DB 
files a few milliseconds apart "sometimes" so that a journal file that 
has just been committed is in the TimeMachine backup captured still in 
its uncommitted state while the DB itself is in the committed state 
already (or perhaps vice-versa).


This theory however makes no sense unless either the journal mode in 
SQLite has changed (either via update or pragma change by user), or 
TimeMachine itself has changed its ways since when things used to work 
correctly.






Re: [sqlite] unsubscribe

2018-10-23 Thread Rob Dixon
I did that yesterday 3 times, got 3 confirmations and yet..

On Tue, Oct 23, 2018 at 10:38 AM Tim Streater  wrote:

> On 22 Oct 2018, at 20:08, thomgrayr...@printeasy.net wrote:
>
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> Go to that web site, which is shown in every mail you have received from
> this list.
>
>
>
>
>
> --
> Cheers  --  Tim


Re: [sqlite] Regarding CoC

2018-10-22 Thread Rob Dixon
Thank you Simon, I will. SQL Compact will do just fine for me. Funny, I
used to advocate for using SQLite. Now that I know more about the people
involved with it, I want nothing to do with it. Best of luck being weird
and antagonistic.

On Mon, Oct 22, 2018 at 1:22 PM R Smith  wrote:

>
> On 2018/10/22 7:09 PM, Simon Slavin wrote:
> >
> > If you're not going to accept those rules, in exactly the translation
> used, you might as well pick something entirely different.
> >
> > Simon.
>
> Indeed. Further to this, as I understand a CoC, it's basically the core
> entity informing whomever be so interested, how he/she/they aim to
> conduct themselves in the fulfillment of duties or business.
>
> I do not think it proscribes or prescribes to anyone else.
>
> For me, speaking as the avid Atheist I am, the specific rule-set in
> question is weird, but it also provides a great picture into the core
> devs' feelings of what's right, and that they will do the "right" thing
> in general, by the average understanding of the word "right" among most
> current philosophies - and I imagine that's really all they wished to say.
>
> To add to that, if there is one thing we (as free thinkers) hold in
> highest regard, it's not being forced to do anything; not being
> commanded; not having to bend to another will or doctrine. So I say keep
> it.
>
>
>
> PS: I could never comply with rule 63.  :)
>
>


Re: [sqlite] Regarding CoC

2018-10-22 Thread Rob Dixon
SQLite lost a lot of credibility with a lot of people over this stunt. I
know you think it doesn’t matter, but SQLite is a brand- this tarnishes
that brand and makes life more difficult for those of us who need to
justify using it to project stakeholders. If it is a serious CoC,
stakeholders see a red flag because they don’t want tech held hostage by
perceived religious fanatics. If it is a joke stakeholders think the tech
itself is unprofessional. It’s a lose-lose situation, and I’m wondering if
taking on a culture of vapid CoCs is worth tarnishing your brand name over?

On Mon, Oct 22, 2018 at 9:29 AM Richard Hipp  wrote:

> On 10/22/18, Chris Brody  wrote:
> >> Looks like that happened this morning.
> >> https://news.ycombinator.com/item?id=18273530
> >
> > I saw it coming, tried to warn you guys in private.
>
> There is indeed a reactionary hate mob forming on twitter.  But most
> of the thoughtful commentators have been supportive, even if they
> disagree with the particulars of our CoC. They totally get that we are
> not being exclusive, but rather setting a standard of behavior for
> participation in the SQLite community.
>
> I have tried to make that point clear in the preface to the CoC, that
> we have no intention of enforcing any particular religious system on
> anybody, and that everyone is welcomed to participate in the community
> regardless of ones religious proclivities.  The only requirement is
> that while participating in the SQLite community, your behavior not be
> in direct conflict with time-tested and centuries-old Christian
> ethics.  Nobody has to adhere to a particular creed.  Merely
> demonstrate professional behavior and all is well.
>
> Many detractors appear to have not read the preface, or if they read
> it, they did not understand it.  This might be because I have not
> explained it well.  The preface has been revised, months ago, to
> address prior criticism from the twitter crowd.  I think the current
> preface is definitely an improvement over what was up at first.  But,
> there might be ways of improving it further.  Thoughtful suggestions
> are welcomed.
>
> So the question then arises:  If strict adherence to the Rule of St.
> Benedict is not required, why even have a CoC?
>
> Several reasons:  First, "professional behavior" is ill-defined.  What
> is professional to some might be unprofessional to others.  The Rule
> attempts to clarify what "professional behavior" means.  When I was
> first trying to figure out what CoC to use (under pressure from
> clients) I also considered secular sources, such as Benjamin
> Franklin's 13 virtues (http://www.thirteenvirtues.com/) but ended up
> going with the Instruments of Good Works from St. Benedict's Rule as
> it provides more examples.
>
> Secondly, I view a CoC not so much as a legal code as a statement of
> the values of the core developers.  All current committers to SQLite
> approved the CoC before I published it.  A single dissent would have
> been sufficient for me to change course.  Taking down the current CoC
> would not change our values, it would merely obscure them.  Isn't it
> better to be open and honest about who we are?
>
> Thirdly, having a written CoC is increasingly a business requirement.
> (I published the current CoC after two separate businesses requested
> copies of our company CoC.  They did not say this was a precondition
> for doing business with them, but there was that implication.) There
> has been an implicit code of conduct for SQLite from the beginning,
> and almost everybody has gotten along with it just fine.  Once or
> twice I have had to privately reprove offenders, but those are rare
> exceptions.  Publishing the current CoC back in February is merely
> making explicit what has been implicit from the beginning.  Nothing
> has really changed.  I did not draw attention to the CoC back in
> February because all I really needed then was a hyperlink to send to
> those who were specifically curious.
>
> So then, why not use a more modern CoC?  I looked at that too, but
> found the so-called "modern" CoCs to be vapid.  They are trendy
> feel-good statements that do not really get to the heart of the matter
> in the way the the ancient Rule does.  By way of analogy, I view
> modern CoCs as being like pop music - selling millions of copies today
> and completely forgotten next year.  I prefer something more enduring,
> like Mozart.
>
> One final reason for publishing the current CoC is as a preemptive
> move, to prevent some future customer from imposing on us one of those
> modern CoCs that I so dislike.
>
> In summary: The values expressed by the current CoC have been
> unchanged for decades and will not be changing as we move forward.  If
> some people are uncomfortable with those values, then I am very sorry
> for them, but that does not change the fact.  On the other hand, I am
> open to suggestions on how to express those values in a way that
> modern twitter-ites can better understand, so do not he

Re: [sqlite] How to round to an Integer

2018-10-19 Thread Rob Richardson
I don't think the cast rounds, though.  It just truncates.  Am I wrong?

RobR

On Thu, Oct 18, 2018 at 4:13 PM Richard Hipp  wrote:

> On 10/18/18, John Harney  wrote:
> > Recently figured this out.  Seems to work fine
> >
> > trim(trim(round(1.111,0),'0'),'.')   = 1
> >
>
> CAST(1.111 AS integer)
>
> --
> D. Richard Hipp
> d...@sqlite.org


Re: [sqlite] Using SQLite in C# without System.Data.SQLite

2018-10-18 Thread Rob Richardson
What target platforms have you found on which it is unreliable?  I haven't
had problems on Win7, but I haven't used it all that much.

RobR

On Thu, Oct 18, 2018 at 2:57 PM Jordy Deweer  wrote:

> Larry Brasfield wrote: "Jordy Deweer asks: “Is there a way to use a
> SQLite database in a C# project, without
> > installing / configuring / depending on the System.Data.SQLite libraries?
> >
> > I really hope there is.”
> >
> > If you do a web search for the combination of terms “SQLite”, “C#” and
> “library”, you will find several alternative libraries.  Some of them are
> thinner wrappers (of SQLite) or provide less complete exposure of SQLite’s
> API.
> >
> > You have the option of fashioning your own wrapper using the C# P/Invoke
> mechanism, or you may find the ‘NativeMethods’ functionality of interest.
> However, doing this robustly requires enough expertise that you should be
> wary of tackling the project."
>
>
>
> Thank you so much for your reply. I will surely look at the options you
> gave me.
>
>
> I ask because I run into errors a lot, using the System.Data.SQLite
> libraries. It easily crashes, throws exceptions and similar issues... I
> found out that it matters which target platform is selected...
>
>
> So thanks once again.
>
>
> Regards, Jordy
>


Re: [sqlite] Using SQLite in C# without System.Data.SQLite

2018-10-18 Thread Rob Richardson
Why don't you want to use System.Data.SQLite?

On Thu, Oct 18, 2018 at 11:04 AM Jordy Deweer  wrote:

> Dear all
>
> I am new to this group and this is my first question here.
>
> My question is as follows:
>
> Is there a way to use a SQLite database in a C# project, without
> installing / configuring / depending on the System.Data.SQLite libraries?
>
> I really hope there is.
>
> Thank you so much in advance for your help.
>
> Best wishes
>
> Jordy
>


[sqlite] Query planner improvements in case of AUTOMATIC INDEX

2018-09-19 Thread Rob Golsteijn
Hi List,

When investigating performance of one of our queries I found an interesting 
situation that might be an opportunity for performance improvement.
Tested with Sqlite version 3.15.2 (November 2016).

Consider the following table and query

CREATE TABLE Node
(
    Id  INTEGER PRIMARY KEY AUTOINCREMENT,
    x   NUMBER(10),
    y   NUMBER(10), 
    HeightLevel NUMBER(2),
    GeomWGS84   BLOB,
    Perm_id TEXT
    /* some irrelevant fields removed */
);

/* find duplicates */
SELECT NOD1.PERM_ID,
   NOD1.X,
   NOD1.Y,
   NOD1.HeightLevel,
   NOD1.GeomWGS84
  FROM    Node NOD1
   INNER JOIN Node NOD2 ON NOD1.X   = NOD2.X
   AND NOD1.Y   = NOD2.Y 
   AND NOD1.HeightLevel = NOD2.HeightLevel
   AND NOD1.GeomWGS84   = NOD2.GeomWGS84
   AND NOD1.ID <> NOD2.ID
  ORDER BY NOD1.GeomWGS84;

The query plan of this query is
selectid|order|from|detail
0|0|0|SCAN TABLE Node AS NOD1
0|1|1|SEARCH TABLE Node AS NOD2 USING AUTOMATIC COVERING INDEX (GeomWGS84=? AND 
HeightLevel=? AND y=? AND x=?)
0|0|0|USE TEMP B-TREE FOR ORDER BY

It takes 636 seconds wall clock time to execute the query.

I would expect that this is the execution time of creating the index + the 
execution time of the query when the index is already present.
So if I create the "AUTOMATIC" index explicitly the expected total execution 
time would also be 636 seconds, but

CREATE INDEX idx_node on node (GeomWGS84, HeightLevel, y, x);
Takes 40 seconds and subsequent query execution takes 8 seconds. So 48 seconds 
in total compared to 636 second with the AUTOMATIC query.

The explanation can be found when looking at the query plan of the query (in 
the new schema with index present):

selectid|order|from|detail
0|0|0|SCAN TABLE DH_NOD AS NOD1 USING INDEX idx_node
0|1|1|SEARCH TABLE DH_NOD AS NOD2 USING COVERING INDEX idx_node (GeomWGS84=? 
AND HeightLevel=? AND y=? AND x=?)

So the explicit index is now also used for ORDER BY optimization.

I guess in general it could be used for other optimizations as well .
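
As a side note, for experiments like this the two plans can be compared by
temporarily switching automatic indexing off and re-running EXPLAIN QUERY
PLAN on the query above (a sketch):

PRAGMA automatic_index = OFF;   -- default is ON; set it back to ON afterwards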


The optimization possibility is to re-evaluate the query plan, taking also the 
AUTOMATIC indexes into account, once Sqlite decided that AUTOMATIC indexes are 
useful. 
To avoid extra planning time, maybe this should only be done when AUTOMATICALLY 
INDEXED table(s) are used multiple times in the query (otherwise they will not 
change the query plan anyway)?
Since query planning is typically fast compared to query execution, the extra 
iteration of the query planner may be acceptable for the cases the query plan 
cannot be improved. For our company it would be acceptable but in general I 
cannot judge.

Regards,
Rob Golsteijn
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unique Constraint Failed

2018-09-14 Thread Rob Richardson
Dumb question:  are you sure you're only inserting one record at a time?
Is it possible you're inserting records so fast that the timestamp is the
same for two of them?
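
A minimal sketch of that failure mode (column names guessed from the
description above):

CREATE TABLE samples (ID INTEGER, DateTime TEXT, data BLOB, UNIQUE (ID, DateTime));
INSERT INTO samples VALUES (1, datetime('now'), x'00');
INSERT INTO samples VALUES (1, datetime('now'), x'01');
-- the second insert fails with "UNIQUE constraint failed" whenever both run
-- within the same second, because datetime('now') has one-second resolution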


On Fri, Sep 14, 2018 at 3:30 PM Andrew Stewart 
wrote:

> Hi,
> I am having problems with a database reporting Unique
> Constraint Failed when doing an insert.
> Table consists of 3 columns:
> ID, DateTime, data
> Constraint is on ID,DateTime.
>
> DateTime trying to enter is current time.
>
> File is 200+ GB.
>
> I have tested this same to a 1.4TB file, but have updated
> my copy of SQLite source since that test.
>
> Thanks,
> Andrew Stewart
> Software Designer
>
> ARGUS CONTROLS
> 18445 53rd Avenue | Surrey, BC | V3S 7A4 | Canada
> t +1.604.538.3531  ext. 108 | +1.800.667.2090 | f +1.604.538.4728
> www.arguscontrols.com
>
> Notice: This electronic transmission contains confidential information,
> intended only for the person(s) named above. If you are not the intended
> recipient, you are hereby notified that any disclosure, copying,
> distribution, or any other use of this email is strictly prohibited. If you
> have received this transmission by error, please notify us immediately by
> return email and destroy the original transmission immediately and all
> copies thereof.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [SQLITE]select from a table and use its data to select from another one

2018-09-14 Thread Rob Richardson
Double quotes can be used to specify that you mean a database object when
the name of the object might be confused with a keyword.  For example, my
company's database models a production system with various recipes.  We
call them "cycles".  But the word "cycle" appears to have some specific
meaning inside SQL (or at least, inside PostgreSQL).  So, although the
query
   SELECT * FROM cycle
works, we should probably use
SELECT * FROM "cycle"
to avoid any possible ambiguity.
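
One SQLite-specific caveat worth knowing (a sketch; the column names here are
hypothetical): if a double-quoted name does not resolve to any column, SQLite
falls back to treating it as a string literal instead of raising an error, so
a typo can silently change the meaning of a query:

SELECT * FROM cycle WHERE name = "recipe_nmae";
-- if no column called recipe_nmae exists, this compares name against the
-- literal text 'recipe_nmae' rather than flagging the typo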

RobR

On Fri, Sep 14, 2018 at 2:12 PM Rob Richardson 
wrote:

> The use of single quotes instead of double quotes in database queries is
> not limited to SQLite.  That's part of the SQL standard.
>
> RobR
>
> On Fri, Sep 14, 2018 at 2:05 PM David Raymond 
> wrote:
>
>> Small typo:
>>
>> SELECT * FROM table2 JOIN table1
>> ON table1.rowid = table2.rowid
>> WHERE table1.name LIKE '%smth%'
>>
>>
>> -Original Message-
>> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>> On Behalf Of Simon Slavin
>> Sent: Friday, September 14, 2018 1:59 PM
>> To: SQLite mailing list
>> Subject: Re: [sqlite] [SQLITE]select from a table and use its data to
>> select from another one
>>
>> On 14 Sep 2018, at 6:50pm, Maziar Parsijani 
>> wrote:
>>
>> > I have 2 tables with the same rowid now I want to :
>> > select rowid from table1 where table1 like "%smth%"
>> > select * from table2 where rowid =(selected rows before)
>> >
>> > I mean if I could do it in a same query.
>>
>> This is what JOIN is for.
>>
>> SELECT * FROM table2
>> JOIN table1.rowid = table2.rowid
>> WHERE table1.name LIKE '%smth%'
>>
>> Note that SQLite uses single quotes ' for text strings, not double quotes
>> ".
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [SQLITE]select from a table and use its data to select from another one

2018-09-14 Thread Rob Richardson
The use of single quotes instead of double quotes in database queries is
not limited to SQLite.  That's part of the SQL standard.

RobR

On Fri, Sep 14, 2018 at 2:05 PM David Raymond 
wrote:

> Small typo:
>
> SELECT * FROM table2 JOIN table1
> ON table1.rowid = table2.rowid
> WHERE table1.name LIKE '%smth%'
>
>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Simon Slavin
> Sent: Friday, September 14, 2018 1:59 PM
> To: SQLite mailing list
> Subject: Re: [sqlite] [SQLITE]select from a table and use its data to
> select from another one
>
> On 14 Sep 2018, at 6:50pm, Maziar Parsijani 
> wrote:
>
> > I have 2 tables with the same rowid now I want to :
> > select rowid from table1 where table1 like "%smth%"
> > select * from table2 where rowid =(selected rows before)
> >
> > I mean if I could do it in a same query.
>
> This is what JOIN is for.
>
> SELECT * FROM table2
> JOIN table1.rowid = table2.rowid
> WHERE table1.name LIKE '%smth%'
>
> Note that SQLite uses single quotes ' for text strings, not double quotes
> ".
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Re: Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett

David,

We've been through that level of detail. That's how we found that, after 
an analyse, a new automatic covering index was being created for a query, 
an index that was not needed before the analyse. That puzzled us and still 
does.


Rob

On 31 Jul 2018, at 16:58, David Raymond wrote:

Take a look at the queries being run and do an "explain query plan" 
for each and look at the output to see what it decides to use. Look 
for "SCAN TABLE" cases that might benefit, or other oddities where 
it's doing anything you're not expecting.
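
A minimal sketch of doing that from the shell (the query shown is only a
placeholder for the real ones):

EXPLAIN QUERY PLAN
SELECT * FROM Disruptions WHERE location = 1 ORDER BY startTime;

Any line reading "SCAN TABLE Disruptions" rather than "SEARCH ... USING INDEX"
is a candidate for a closer look.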



-Original Message-
From: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Rob 
Willett

Sent: Tuesday, July 31, 2018 11:42 AM
To: SQLite mailing list
Subject: Re: [sqlite] [EXTERNAL] Re: Very, very slow commits - 
Possibly solved


Gunter,

Thanks for this. We have already started on this as we realised that 
the
COLLATE NOCASE was irrelevant and actually slowed down integer 
queries.


What we have not done is reorder the table to match the index queries.
This had not occurred to us.

We think we have already created the indexes so that we 'cover' the
queries. One advantage (!) of a 200M row table is that inefficient
queries are immediately obvious as we have to do a table scan.

We have not checked that the order of columns in the index matches the
ORDER BY clauses. We never thought of that either.

Thanks for the help.

Rob

On 31 Jul 2018, at 16:21, Hick Gunter wrote:


Based on the currently available information I would suggest the
following schema:

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER PRIMARY KEY NOT NULL AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "location" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "corridor" integer NOT NULL,
 "severity" integer NOT NULL,
 "startTime" TEXT NOT NULL COLLATE NOCASE,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "endTime" text NOT NULL COLLATE NOCASE,
 "remarkTime" TEXT NOT NULL COLLATE NOCASE,
 "lastModTime" TEXT NOT NULL COLLATE NOCASE,
 "Direction" TEXT COLLATE NOCASE
);

CREATE INDEX "Disruptions_Idx1" ON Disruptions
("location","corridor","status","category","severity","levelOfInterest","version","subCategory","startTime");

CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions
("Disruption_id","version","category","subCategory");

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version","Disruption_id","location");

CREATE INDEX Disruptions_Idx5 ON Disruptions
("status","Disruption_id","Severity","levelOfInterest","category","subCategory","version");

Remarks:

COLLATE is only required for text values. If you always store data of
the declared type, COLLATE has no business with a non-text column.

I have reordered the fields so that fields used in one or more 
indexes

are at the front of the record. This allows SQLite to quit decoding
the record faster when building index strings. I have assumed that
each index is used/updated equally often; if you have an index that 
is

used most, reordering the fields may help processing speed.

Non-index fields should be ordered so that fields that feature
prominently in the retrieved data come first; again, this will allow
SQLite to quit decoding the record earlier when reading data.

It may also improve performance to create a "covering index" by 
adding

the (one or two) fields retrieved to the index used for locating the
record. This allows SQLite to retrieve these fields directly from the
index BTree without referring back to the table BTree.

I assume the order of the fields of each index matches the order of
the fields in the ORDER BY clause(s) of the queries that use the
respective index.

-Original Message-
From: sqlite-users
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of
Rob Willett
Sent: Tuesday, 31 July 2018 16:31
To: SQLite mailing list 
Subject: [EXTERNAL] Re: [sqlite] Very, very slow commits - Possibly
solved

Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our 
database

design. We think our original design has 

Re: [sqlite] [EXTERNAL] Re: Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett

Gunter,

Thanks for this. We have already started on this as we realised that the 
COLLATE NOCASE was irrelevant and actually slowed down integer queries.


What we have not done is reorder the table to match the index queries. 
This had not occurred to us.


We think we have already created the indexes so that we 'cover' the 
queries. One advantage (!) of a 200M row table is that inefficient 
queries are immediately obvious as we have to do a table scan.


We have not checked that the order of columns in the index matches the 
ORDER BY clauses. We never thought of that either.


Thanks for the help.

Rob

On 31 Jul 2018, at 16:21, Hick Gunter wrote:

Based on the currently available information I would suggest the 
following schema:


CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER PRIMARY KEY NOT NULL AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "location" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "corridor" integer NOT NULL,
 "severity" integer NOT NULL,
 "startTime" TEXT NOT NULL COLLATE NOCASE,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "endTime" text NOT NULL COLLATE NOCASE,
 "remarkTime" TEXT NOT NULL COLLATE NOCASE,
 "lastModTime" TEXT NOT NULL COLLATE NOCASE,
 "Direction" TEXT COLLATE NOCASE
);

CREATE INDEX "Disruptions_Idx1" ON Disruptions 
("location","corridor","status","category","severity","levelOfInterest","version","subCategory","startTime");


CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions 
("Disruption_id","version","category","subCategory");


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions 
("version","Disruption_id","location");


CREATE INDEX Disruptions_Idx5 ON Disruptions 
("status","Disruption_id","Severity","levelOfInterest","category","subCategory","version");


Remarks:

COLLATE is only required for text values. If you always store data of 
the declared type, COLLATE has no business with a non-text column.


I have reordered the fields so that fields used in one or more indexes 
are at the front of the record. This allows SQLite to quit decoding 
the record faster when building index strings. I have assumed that 
each index is used/updated equally often; if you have an index that is 
used most, reordering the fields may help processing speed.


Non-index fields should be ordered so that fields that feature 
prominently in the retrieved data come first; again, this will allow 
SQLite to quit decoding the record earlier when reading data.


It may also improve performance to create a "covering index" by adding 
the (one or two) fields retrieved to the index used for locating the 
record. This allows SQLite to retrieve these fields directly from the 
index BTree without referring back to the table BTree.
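
As a sketch of that last point (which columns are worth appending depends
entirely on the actual SELECT list, so the extra column here is only an
example):

CREATE INDEX Disruptions_Idx2_cov ON Disruptions
("Disruption_id", "version", "category", "subCategory", "startTime");

A query that locates rows by Disruption_id and version and only reads
startTime can then be answered from this index alone.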


I assume the order of the fields of each index matches the order of 
the fields in the ORDER BY clause(s) of the queries that use the 
respective index.


-Original Message-
From: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of 
Rob Willett

Sent: Tuesday, 31 July 2018 16:31
To: SQLite mailing list 
Subject: [EXTERNAL] Re: [sqlite] Very, very slow commits - Possibly 
solved


Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our database 
design. We think our original design has an implicit ordering of rows 
in a table; when the table is only increasing, this flaw in the design 
isn't apparent.


However when we started deduping the table AND we copied rows from one 
table to another to move things around, we changed the underlying 
order of rows. Sqlite handles the design change BUT the flaw in our 
design becomes apparent as we keep moving the data around and data 
gets mixed up. The database slows down when we create a second table 
with an identical structure to the first table, copy the data into the 
new table, drop the old and then when we rename the old table to the 
new table, things appear to slow down. Logically speaking SQLite 
shouldn't notice the difference in row order, but things do slow down, 
even with analyse.


We think that a better index definition

Re: [sqlite] Very, very slow commits

2018-07-31 Thread Rob Willett

Simon,

Absolutely no need to apologise. We should apologise for all the time we 
have taken from other people :(


We recognise that the collate no case is inappropriate for our database. 
We suspect this was added from a SQLite tool we used some time ago. We 
are going to use this opportunity to remove this sort of nonsense.


We still think we have an inherent design fault in our database that we 
are trying to understand. One of the problems we had was that checking 
the performance of a 200M row table for bad indexes is time consuming. 
We now have a workflow to get from a 60GB database to a 600MB database 
in a few hours. We cannot do all the work in SQL as it involves an 
external program to analyse the data but a few hours to run isn't bad. 
As we now have the database held locally, we can thrash the local server 
silly to get the performance we need.


Rob

On 31 Jul 2018, at 16:18, Simon Slavin wrote:

On 31 Jul 2018, at 2:59pm, Rob Willett  
wrote:


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE 
to the columns which are defined as integers. Would that make a 
difference?


What you did is correct.  I gave wrong advice for which I apologise.  
But I am now confused since your original code is a little strange.  
Your original has a table definition including


"version" integer NOT NULL,

but then

	CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions 
("Disruption_id" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"category" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC);


If "version" really is an INTEGER, then it is incorrect to use version 
COLLATE NOCASE in the index.  NOCASE is purely for text values.  This 
may be slowing things down.


To solve it, in the table definition, use COLLATE NOCASE for TEXT 
columns and not for INTEGER columns.  Also, remove all mentions of 
COLLATE NOCASE in your index definitions.  Collation methods should be 
set in the table definition, not in indexes, except for some unusual 
situations.


This should increase your speed relative to your original timings.  If 
it slows things down, something else I haven't spotted is wrong.


We've found it now takes around 10% longer to do the queries than 
before.


That is understandable given the incorrect advice I gave you before.

In another post you report some strange timing problems with no simple 
explanation.  When I get such things I suspect database corruption or 
hardware problems and run an integrity_check.  But with a 60Gig 
database I might think twice.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett
 also a discussion that 
SSD's may mean that we are constantly getting file misses from the OS 
cache. I've yet to validate that theory. It could also be that something 
we do when messing with the sqlite_sequence table means that data is being 
inserted into holes somewhere.


15. We also have looked at an older piece of code and we *think* it 
makes an assumption that data is held in contiguous rows (or it could be 
that the query is poorly written and that we need to look at the 
indexes). The code isn't so obvious as to look at row_ids and work with 
that, but it's a hint that we're still chasing down.


16. We did notice that running analyse actually made the database 
perform worse than before. This was due to it using a specific index 
before the analyse and then afterwards it used an automatic covering 
index. We then created a real index to get it working correctly again.


17. We're now going back to our 60GB database to try and work through 
the whole process again to see if we can confirm our hypothesis.


Rob

On 31 Jul 2018, at 15:40, Chris Locke wrote:

I've been following this thread with interest, but this just doesn't 
make

sense...

 Logically speaking SQLite shouldn't notice the difference in row 
order,

but things do slow down,

even with analyse.


Are you accessing each row via its ID?  Even so, that should still be
indexed.
I thought you were simply adding records into the database - I'm 
failing to

grasp how this is slowing down in the new database.


Thanks,
Chris



On Tue, Jul 31, 2018 at 3:30 PM Rob Willett 


wrote:


Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our 
database
design. We think our original design has an implicit ordering of rows 
in
a table; when the table is only increasing, this flaw in the design isn't
apparent.

However when we started deduping the table AND we copied rows from 
one
table to another to move things around, we changed the underlying 
order

of rows. Sqlite handles the design change BUT the flaw in our design
becomes apparent as we keep moving the data around and data gets 
mixed

up. The database slows down when we create a second table with an
identical structure to the first table, copy the data into the new
table, drop the old and then when we rename the old table to the new
table, things appear to slow down. Logically speaking SQLite 
shouldn't
notice the difference in row order, but things do slow down, even 
with

analyse.

We think that a better index definition could solve the problem for 
us,

a better database design would, but that's a tricky problem.

We're now going back to our 60GB database and start from scratch to 
see

if we can create the issue (now we think we know what it is).

Thanks to everybody who contributed ideas, we appreciate the help.

Rob

On 31 Jul 2018, at 15:19, Rob Willett wrote:


Simon,

As an exercise we have just added in COLLATE NOCASE to our integer
columns.

Whoops! We thought this would make no difference but it's added an extra
70% to our processing time.

We've now got to the stage where we can make changes quickly, so 
we'll

back that change out and go back to the integer defn without COLLATE
NOCASE.

Rob

On 31 Jul 2018, at 14:59, Rob Willett wrote:


Simon,

Apologies for taking so long to get back, we've been building a 
test

system and its taken a long time.

We're just getting round to trying your ideas out to see what
difference they make,

We've created a new table based on your ideas, moved the collate 
into

the table, analysed the database. We did **not** add COLLATE NOCASE
to the columns which are defined as integers. Would that make a
difference?

We've found it now takes around 10% longer to do the queries than
before.

Rob



Please try moving your COLLATE clauses into the table definition.
e.g. instead of

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions 
("version"

COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location"
COLLATE NOCASE ASC);

Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long
time.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinf

Re: [sqlite] Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett

Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our database 
design. We think our original design has an implicit ordering of rows in 
a table; when the table is only increasing, this flaw in the design isn't 
apparent.


However when we started deduping the table AND we copied rows from one 
table to another to move things around, we changed the underlying order 
of rows. Sqlite handles the design change BUT the flaw in our design 
becomes apparent as we keep moving the data around and data gets mixed 
up. The database slows down when we create a second table with an 
identical structure to the first table, copy the data into the new 
table, drop the old and then when we rename the old table to the new 
table, things appear to slow down. Logically speaking SQLite shouldn't 
notice the difference in row order, but things do slow down, even with 
analyse.


We think that a better index definition could solve the problem for us, 
a better database design would, but that's a tricky problem.


We're now going back to our 60GB database and start from scratch to see 
if we can create the issue (now we think we know what it is).


Thanks to everybody who contributed ideas, we appreciate the help.

Rob

On 31 Jul 2018, at 15:19, Rob Willett wrote:


Simon,

As an exercise we have just added in COLLATE NOCASE to our integer 
columns.


Whoops! We thought this would make no difference but it's added an extra 
70% to our processing time.


We've now got to the stage where we can make changes quickly, so we'll 
back that change out and go back to the integer defn without COLLATE 
NOCASE.


Rob

On 31 Jul 2018, at 14:59, Rob Willett wrote:


Simon,

Apologies for taking so long to get back, we've been building a test 
system and its taken a long time.


We're just getting round to trying your ideas out to see what 
difference they make,


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE 
to the columns which are defined as integers. Would that make a 
difference?


We've found it now takes around 10% longer to do the queries than 
before.


Rob


Please try moving your COLLATE clauses into the table definition.  
e.g. instead of


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);


Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long 
time.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-31 Thread Rob Willett

Simon,

As an exercise we have just added in COLLATE NOCASE to our integer 
columns.


Whoops! We thought this would make no difference but it's added an extra 70% 
to our processing time.


We've now got to the stage where we can make changes quickly, so we'll 
back that change out and go back to the integer defn without COLLATE 
NOCASE.


Rob

On 31 Jul 2018, at 14:59, Rob Willett wrote:


Simon,

Apologies for taking so long to get back, we've been building a test 
system and its taken a long time.


We're just getting round to trying your ideas out to see what 
difference they make,


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE to 
the columns which are defined as integers. Would that make a 
difference?


We've found it now takes around 10% longer to do the queries than 
before.


Rob


Please try moving your COLLATE clauses into the table definition.  
e.g. instead of


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);


Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long 
time.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-31 Thread Rob Willett

Simon,

Apologies for taking so long to get back, we've been building a test 
system and its taken a long time.


We're just getting round to trying your ideas out to see what difference 
they make,


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE to 
the columns which are defined as integers. Would that make a difference?


We've found it now takes around 10% longer to do the queries than 
before.


Rob


Please try moving your COLLATE clauses into the table definition.  
e.g. instead of


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);


Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long 
time.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Droedel,

Fortunately we have no performance issues in production with the 60GB 
database. These issues came out in testing (which is what testing is 
for).


We're investigating the newly generated IDs as we speak, or we will be 
once we get our replica production system set up.


sqlite3_analyzer has been a problem for us. We've struggled to get a 
build for it.


Rob

On 30 Jul 2018, at 13:49, Droedel wrote:


Hi Rob,
Answers are in the text below

On Mon, Jul 30, 2018, at 13:45, Rob Willett wrote:

Droedel,

We don't think there are significant read access. The database is a
single database on a single thread on a single process. The only 
access
to it is a Perl script that logs the incoming information. We never 
have

two accesses at the same time.


Can you also _measure_ read access, preferably on system level on your 
production database, e.g. by using iostat ? I've seen cases where 
(other, non-SQLite) databases had unexpected disk access patterns due 
to an application error.


We have a nagging feeling (and that's all it is) about the
autoincrement
value. We do use that feature in the table, but we have deleted so 
many


[snip]
If both databases use autoincrement, then performance should be 
similar (or at least that's what I expect). Can you easily check if 
the newly generated IDs are as expected and larger than any existing 
ID in your table ?



We did wonder if we are filling up pages in the middle or something.
However we expected the vacuum and analyse to sort this out. Now its


[snip]
sqlite3_analyzer can give some measurements, e.g. unused bytes on 
index pages.



We've built the replica test system now and we're going to have some
initial checks and get some benchmarks in place.

It could be an interesting and exciting ride :)


Sure. It's always fun learning something new. But it's less fun in 
full production when customers are yelling :-(


Regards,

Droedel




Rob


On 30 Jul 2018, at 12:32, Droedel wrote:


Hi Rob,

Is there significant read access (iostat: r/s) during these slow
writes ? If yes, it might be due to a small cache, requiring the
database to read (index) pages before updating them.

And is the data you're adding in both databases (large/small) added 
at

the end of the table using the autoincrement, or do you insert some
items in the middle ? I'm not a SQLite performance expert, but in
other databases performance can be very different because in the
former case fewer pages must be updated.

Microsoft SQL Server has something called "fill factor", basically 
the

max percentage of an index page that is used during initial fill,
which helps avoiding too many page shuffling in the index when extra
items are added. Disadvantage: it makes DBAs argue endlessly about 
the

best fill factor ;-) Maybe there's something similar possible in
SQLite but I couldn't find a pragma for this.

Oh, and do both databases have the same page size, preferably 4K ?

Regards,

Droedel



On Mon, Jul 30, 2018, at 12:51, Rob Willett wrote:

Droedel,

Thanks for the comprehensive reply. We have actually done all of
this.

The system has been running for 2-3 years and we have taken the
opportunity to try and prune the database from 60GB down to 600MB.
Currently the live system is working OK with a 60GB database, but 
our

test system (which is smaller) is struggling with 600MB.

The system has a restriction of IOPS as it's a Virtual Private
Server.
Technically it's running Ubuntu 16.04 under OpenVZ. We can get
69MB/sec
with a disk to disk copy, which isn't brilliant if we had sustained
disk
traffic which we don't.

We log approx 600 - 800 items of around 3-5K every five minutes.
These
600-800 items are mainly an insert into a single table, there are
other
things happening as well, but this is the bulk of the work. We can
see
that the -wal files grow a small amount (4-5MB) just before the
commit.
It then takes 7 seconds to execute the commit. This is the bit that
we're struggling with. We know we can get circa 70MB/sec data
throughput, so this should take a fraction of a second. Now SQLite
needs
to work out which pages to commit so that's a little longer, but we
know
SQLite is fast, so that shouldn't take 7 seconds on the small
database
as it doesn't take that long on the large 60GB database. That's the
puzzling bit, the large database is quick, the small one slow.

We have no logging turned on, we can turn SQL logging on at the DBI
level but that turns a 20 sec run into a 2-3 minute run as it
captures
everything :) Nothing in the log files gives us any concern (apart
from
the really long commit time). Simon Slavin suggested dropping the
indexes which we did, that turned the commit into a fast commit, so
its
something to do with the indexes but we can't see what.

What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB

Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Warren,

The hardware is different; at the time we didn't want to spin up a 
complete production replica as that's quite expensive. We used a smaller 
machine, both have the same type of back-end spinning RAID array, but we 
would think that writing 4-5MB of changed data back shouldn't take 7 
seconds. We had seen far better performance on the slower machine 
earlier in testing.


We will go back to step one and work our way through step by step from 
60GB to 600Mb as our thinking is that we have somehow screwed our 
database up.


Rob


On 30 Jul 2018, at 13:29, Warren Young wrote:

On Jul 30, 2018, at 5:53 AM, Rob Willett 
 wrote:


I would wonder why writing the data to a 60GB database and doing a 
commit is fast and writing exactly the same data to the 600MB 
database is different. The programs for doing it are the same, the 
database schema is identical.


I assume the hardware is different.  Is that not the case?

If the small DB is on a machine with a spinning disk but the large DB 
is on a machine with either an SSD or a many-spindled RAID, there’s 
your key difference.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Warren,

On 30 Jul 2018, at 12:28, Warren Young wrote:

On Jul 30, 2018, at 4:51 AM, Rob Willett 
 wrote:


The system has been running for 2-3 years


Has performance changed over that span?  Which direction?


Performance hasn't changed on the large 60GB database. It's pretty 
consistent.




we have taken the opportunity to try and prune the database from 60GB 
down to 600MB.


SQLite’s speed is only weakly affected by database size.  A starting 
guess is log2(N) where N is file size.


Since your data access didn’t get 10x faster from your 100x size 
drop, you can fairly guess that the speed problem isn’t due to the 
inherent time required to traverse tree-based data structures in 
SQLite.




The reason for the pruning is not for the benefit of SQLite, it's for 
our own administration. Backing up 60GB files is hard work, we're also 
struggling to fire up replicas, so we wanted to look at the database as 
a whole and get the size down. The 99% reduction is great, we were 
hoping for a 50% to 80% reduction.


The system has a restriction of IOPS as it's a Virtual Private 
Server. Technically it's running Ubuntu 16.04 under OpenVZ. We can 
get 69MB/sec with a disk to disk copy, which isn't brilliant if we 
had sustained disk traffic which we don’t.


I am sensing a spinning disk.  (Else, throughput should be a lot 
higher.)


I suspect that is the case. However 69MB/sec is adequate for the 60GB 
database and for normal usage. Our commits are fine on the 60GB 
database, just not the 600MB one.


SQLite takes data durability (the D in ACID) seriously, unlike most 
other software applications, so it is uncommon in that it flushes each 
transaction to disk before proceeding with further writes to that 
table.


A flush to disk takes a full disk rotation, and due to the way 
SQLite’s journal operates, each transaction requires two flushes.  
That means that with a 7200 RPM disk, you can get a maximum of 60 
transactions per second per table with SQLite.
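
(Spelling out that arithmetic: 7200 RPM is 120 rotations per second, and two
flushes per transaction means two rotations per transaction, so 120 / 2 = 60
transactions per second.)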


I agree with your logic, but I would wonder why writing the data to a 
60GB database and doing a commit is fast and writing exactly the same 
data to the 600MB database is different. The programs for doing it are 
the same, the database schema is identical.




Sound familiar?

If I’ve guessed the problem correctly, the solutions are:

1. Batch multiple writes in a transaction.


All ready done.



2. Switch to an SSD.


Not an option in the short term for production BUT the test system we 
have set up has an SSD. This may skew the results though. It's difficult 
to get an identical system set up, but the other option is a VMware 
Fusion system on a Mac with a spinning disk. Most of our local systems 
are SSD, we have a Mac with a spinning disk for backup.




3. Use multiple tables and/or multiple DB files.  In your case, I’d 
suggest one SQLite DB per sensor, with one thread per sensor, each of 
which keeps one of the SQLite DBs open continuously.  That way, a 
blocked DB conn won’t block any other writers.


We have one process that reads a single file in every 5 mins. No need 
for multiple databases or multiple threads.




Those solutions are given in order of ease of application and cost of 
implementation.



Nothing in the log files gives us any concern


Have you tried SQLite’s new .expert feature?


No, but we will now :)


   https://sqlite.org/cli.html#index_recommendations_sqlite_expert_
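
For reference, a minimal sketch of driving that from the sqlite3 shell (the
query is just a placeholder):

.expert
SELECT * FROM Disruptions WHERE location = 1 AND status = 2;

In expert mode the shell prints any candidate CREATE INDEX statements it would
recommend for the statement, together with the resulting query plan, instead
of executing it.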


dropping a very large table is really, really, really slow.


If you put your main data table in a file of its own, you can quickly 
“drop” the table by just closing the DB and removing the DB file 
from disk.
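
A minimal sketch of that layout (the file name and the shortened column list
are hypothetical):

ATTACH DATABASE 'disruptions_data.db' AS bulk;
CREATE TABLE bulk.Disruptions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    Disruption_id INTEGER NOT NULL,
    version INTEGER NOT NULL
    -- remaining columns as in the main schema
);
-- "dropping" the table later is then DETACH DATABASE bulk; plus deleting the file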


That's what we will do on our test system.



When you then recreate the DB file with a fresh schema, it’s 
effectively defragged/vacuumed as well.


We have a copy of the 60GB data file (took a long time to download) 
locally now. We will use this (or rather a copy) to start the testing.


Thanks very much for the thoughtful and useful comments.

Rob


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Droedel,

We don't think there are significant read access. The database is a 
single database on a single thread on a single process. The only access 
to it is a Perl script that logs the incoming information. We never have 
two accesses at the same time.


We have a nagging feeling (and that's all it is) about the autoincrement 
value. We do use that feature in the table, but we have deleted so many 
rows and we may have messed something up as we move data from table to 
table. We did read https://www.sqlite.org/autoinc.html a week or so ago 
and we wonder whether, having deleted circa 199,500,000 rows from the 
table of 200,000,000 rows, we may have cocked it up somehow. We have 
never directly accessed the sqlite_sequence table (except to read) but 
this page makes us wonder if we have missed an error somewhere.
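
A quick sanity check we can run for that (a sketch, using our table name):

SELECT seq FROM sqlite_sequence WHERE name = 'Disruptions';
SELECT max(id) FROM Disruptions;
-- with AUTOINCREMENT the stored seq should be >= max(id), and newly
-- inserted ids should come out above both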


We did wonder if we are filling up pages in the middle or something. 
However we expected the vacuum and analyse to sort this out. Now it's 
entirely possible we have had this problem before now, hence we are going 
to go back to the very beginning with our 61GB database and take it step 
by step so we understand what is happening. What's frustrating is that 
the 61GB database seems fast and the 600MB seems slow yet they both have 
the same database schema. We feel that we have made an error somewhere 
but only now discovered it.


We've built the replica test system now and we're going to have some 
initial checks and get some benchmarks in place.


It could be an interesting and exciting ride :)

Rob


On 30 Jul 2018, at 12:32, Droedel wrote:


Hi Rob,

Is there significant read access (iostat: r/s) during these slow 
writes ? If yes, it might be due to a small cache, requiring the 
database to read (index) pages before updating them.


And is the data you're adding in both databases (large/small) added at 
the end of the table using the autoincrement, or do you insert some 
items in the middle ? I'm not a SQLite performance expert, but in 
other databases performance can be very different because in the 
former case fewer pages must be updated.


Microsoft SQL Server has something called "fill factor", basically the 
max percentage of an index page that is used during initial fill, 
which helps avoiding too many page shuffling in the index when extra 
items are added. Disadvantage: it makes DBAs argue endlessly about the 
best fill factor ;-) Maybe there's something similar possible in 
SQLite but I couldn't find a pragma for this.


Oh, and do both databases have the same page size, preferably 4K ?
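
A quick way to check that (a sketch):

PRAGMA page_size;     -- bytes per page
PRAGMA page_count;    -- total pages, if the overall size is of interest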

Regards,

Droedel



On Mon, Jul 30, 2018, at 12:51, Rob Willett wrote:

Droedel,

Thanks for the comprehensive reply. We have actually done all of 
this.


The system has been running for 2-3 years and we have taken the
opportunity to try and prune the database from 60GB down to 600MB.
Currently the live system is working OK with a 60GB database, but our
test system (which is smaller) is struggling with 600MB.

The system has a restriction of IOPS as it's a Virtual Private 
Server.
Technically it's running Ubuntu 16.04 under OpenVZ. We can get 
69MB/sec
with a disk to disk copy, which isn't brilliant if we had sustained 
disk

traffic which we don't.

We log approx 600 - 800 items of around 3-5K every five minutes. 
These
600-800 items are mainly an insert into a single table, there are 
other
things happening as well, but this is the bulk of the work. We can 
see
that the -wal files grow a small amount (4-5MB) just before the 
commit.

It then takes 7 seconds to execute the commit. This is the bit that
we're struggling with. We know we can get circa 70MB/sec data
throughput, so this should take a fraction of a second. Now SQLite 
needs
to work out which pages to commit so that's a little longer, but we 
know
SQLite is fast, so that shouldn't take 7 seconds on the small 
database

as it doesn't take that long on the large 60GB database. That's the
puzzling bit, the large database is quick, the small one slow.

We have no logging turned on, we can turn SQL logging on at the DBI
level but that turns a 20 sec run into a 2-3 minute run as it 
captures
everything :) Nothing in the log files gives us any concern (apart 
from

the really long commit time). Simon Slavin suggested dropping the
indexes which we did, that turned the commit into a fast commit, so 
its

something to do with the indexes but we can't see what.

What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB memory and 8 
cores

and 150GB disk space. Its virtualised (ESXI) but under our control.
2. We've installed a copy of the old 60GB database on the new system.
3. We're going to benchmark the new system over a couple of thousand
runs to see what the average time is.
4. We'll then work our way through the deduping of the database step 
by
step to see when the commit time blow

Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Droedel,

Thanks for the comprehensive reply. We have actually done all of this.

The system has been running for 2-3 years and we have taken the 
opportunity to try and prune the database from 60GB down to 600MB. 
Currently the live system is working OK with a 60GB database, but our 
test system (which is smaller) is struggling with 600MB.


The system has a restriction of IOPS as it's a Virtual Private Server. 
Technically it's running Ubuntu 16.04 under OpenVZ. We can get 69MB/sec 
with a disk to disk copy, which isn't brilliant if we had sustained disk 
traffic which we don't.


We log approx 600 - 800 items of around 3-5K every five minutes. These 
600-800 items are mainly an insert into a single table, there are other 
things happening as well, but this is the bulk of the work. We can see 
that the -wal files grow a small amount (4-5MB) just before the commit. 
It then takes 7 seconds to execute the commit. This is the bit that 
we're struggling with. We know we can get circa 70MB/sec data 
throughput, so this should take a fraction of a second. Now SQLite needs 
to work out which pages to commit so that's a little longer, but we know 
SQLite is fast, so that shouldn't take 7 seconds on the small database 
as it doesn't take that long on the large 60GB database. That's the 
puzzling bit: the large database is quick, the small one slow.


We have no logging turned on, we can turn SQL logging on at the DBI 
level but that turns a 20 sec run into a 2-3 minute run as it captures 
everything :) Nothing in the log files gives us any concern (apart from 
the really long commit time). Simon Slavin suggested dropping the 
indexes which we did, that turned the commit into a fast commit, so its 
something to do with the indexes but we can't see what.


What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB memory and 8 cores 
and 150GB disk space. Its virtualised (ESXI) but under our control.

2. We've installed a copy of the old 60GB database on the new system.
3. We're going to benchmark the new system over a couple of thousand 
runs to see what the average time is.
4. We'll then work our way through the deduping of the database step by 
step to see when the commit time blows up. This will take a few days as 
working out the duplications of 200,000,000 rows isn't that quick :) As 
we found out, dropping a very large table is really, really, really 
slow.
5. We'll apply some of the ideas that people have suggested since 
yesterday to see if they work, but I'm keen that we have a repeatable 
problem that we solve rather than we use a scatter gun approach to 
fixing it. We think SQLite is well written so we figure the problem is 
ours to solve rather than simply blaming the software.



Thanks

Rob

On 30 Jul 2018, at 11:11, Droedel wrote:


Hi,

When having bad performance, I usually first try to find out if the 
slowness is due to disk througput (sequential), slow random access or 
something else. In Linux, try "iostat -xtc 5". Do this with and 
without your application writing to disk.


If you see high CPU %iowait and high %util on your disk, then disk is 
the bottleneck. If not: start profiling / analyzing other bottlenecks 
(CPU / network / ...)


If the disk throughput (wMB/s) is close to your normal sequential 
throughput (69 MB/s): try to write less data or get a faster disk.
If the disk troughput is low, but high numbers of writes (w/s): 
there's too much seeking / too many small writes to your disk. Page 
cache too small ? Checkpointing too often ?
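
Both of those are easy to inspect from the shell, for example:

PRAGMA cache_size;           -- pages; a negative value means size in KiB
PRAGMA wal_autocheckpoint;   -- default is 1000 pages
PRAGMA wal_checkpoint;       -- reports busy / log frames / checkpointed frames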


Sometimes this kind of problems is caused by other applications 
(logging / ...) causing too much baseload. %util should be low when 
your application isn't running.


Just my 2 cents.

Kind regards,

Droedel


On Sun, Jul 29, 2018, at 10:14, Rob Willett wrote:

Hi,

Background

We've been refactoring our database to reduce the size of it. Through
some simple logic we've managed to pull out 99% of the data to reduce
the size from 51GB down to approx 600MB. This logic has been to 
remove

rows that are almost the same but not quite identical. As with all
things, the thinking was the difficult bit, the execution somewhat
easier.

As part of the testing we've been doing, we've now hit on an odd and
weird problem to do with the COMMIT statement. A commit of a few 
hundred
(circa 600-800) rows takes approx 7 seconds whereas before we never 
even

noticed it, though we now know it was two seconds before. Each row is
probably 1-2K of data, so it's not very much at all.

Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent 
69MB/sec.

This is not as fast we

Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Thanks for the mail.

We ran analyze with no indexes, made no difference.

We recreated the indexes and ran analyze again. The very long commit is 
back, this time it took 14 secs :)


It does appear that the indexes have something to do with this, even 
though they have not been an issue until now.


Rob

On 29 Jul 2018, at 11:45, J. King wrote:

On July 29, 2018 5:47:29 AM EDT, Rob Willett 
 wrote:

John,

Thanks for the prompt reply and a very good question..

We've dropped the indexes and the commit is now very quick, approx 
two

seconds

However the overall performance of the run is much the same as other
areas of the code are now significantly slower, whereas before they
were
quick.

Where were you going with that question?


Might ANALYZE help?
--
J. King
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Anton,

Dropped the indexes and created them without order.

We'll need to look at what your second para means. It could be a major 
and massive change.


Rob



On 29 Jul 2018, at 11:52, Djelf wrote:


Rob,

Try creating indexes without order.

Or, try to make a column with a hash of the values entering the index and
search for the value by an index on that hash column. This will complicate
the logic of your program, but it will decrease the volume of the database,
and possibly significantly speed up both reading and writing.
significantly speed up both reading and writing.

---
Anton Azanov



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

John,

Thanks for the prompt reply and a very good question..

We've dropped the indexes and the commit is now very quick, approx two 
seconds


However the overall performance of the run is much the same as other 
areas of the code are now significantly slower, whereas before they were 
quick.


Where were you going with that question?

Thanks

Rob

On 29 Jul 2018, at 10:33, John Found wrote:


What is the performance without the indexes?

On Sun, 29 Jul 2018 10:20:11 +0100
"Rob Willett"  wrote:


Update 1

We've copied the disruptions table to a new table, dropped the old
table, copied the new table back in and recreated all the indexes.

Exactly the same commit performance.

We've also tracked the -shm and -wal files and they are around 5MB in
size.

Mmmm tricky (as Deepthought said).

Rob

On 29 Jul 2018, at 9:14, Rob Willett wrote:


Hi,

Background

We've been refactoring our database to reduce the size of it. 
Through
some simple logic we've managed to pull out 99% of the data to 
reduce
the size from 51GB down to approx 600MB. This logic has been to 
remove

rows that are almost the same but not quite identical. As with all
things, the thinking was the difficult bit, the execution somewhat
easier.

As part of the testing we've been doing, we've now hit on an odd and
weird problem to do with the COMMIT statement. A commit of a few
hundred (circa 600-800) rows takes approx 7 seconds whereas before 
we

never even noticed it, though we now know it was two seconds before.
Each row is probably 1-2K of data, so it's not very much at all.

Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent
69MB/sec. This is not as fast as we would like, but it's the same across
across

all our virtual servers.

4. We've tested the commit on our existing 60GB database and it takes
2 seconds, which is longer than we thought it would be. The server for
the 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu
14.04. The server we are testing on is a 2GB/2 core test server
running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't
expect it to take 3 times longer to do a commit.

5. The code is identical across the servers. We are running Perl and
the DBI module. The code for doing a commit in Perl::DBI is
 $dbh->do("COMMIT");

  We are getting the expected performance elsewhere on the system and
in the code. It's just the commit that is taking a long time (a quick way
to isolate this from Perl/DBI is sketched after this list).

6. The code we are committing is adding 600-800 lines to a table that
used to be 200,000,000 rows in size. It's now 400,000 lines in size.
We are wondering if the deletion of the lines has had an impact we
didn't expect. We have vacuumed and analysed the database.
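
One way to check that those seven seconds really are spent inside SQLite 
rather than in Perl/DBI is to replay a batch by hand in the sqlite3 shell 
with timing switched on. A sketch only (the column list is abbreviated):

  sqlite> .timer on
  sqlite> BEGIN;
  sqlite> INSERT INTO Disruptions (version, Disruption_id, ...) VALUES (...);
  ...     -- repeat for a realistic 600-800 rows
  sqlite> COMMIT;
  -- the shell prints a "Run Time:" line after each statement, so the cost
  -- of the COMMIT shows up on its own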

The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE
ASC, "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC,
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC,
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions 
("Disruption_id"

COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version"
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location"
COLLATE NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status", "Disruption_id",
"Severity", "levelOfInterest", "category", "subCategory", "version");

Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Update 1

We've copied the disruptions table to a new table, dropped the old 
table, copied the new table back in and recreated all the indexes.
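
Roughly, the rebuild would have been along these lines (a sketch only; 
the temporary table name is invented, and CREATE TABLE ... AS SELECT 
would not copy the constraints, which is why the new table is created 
from the full schema first):

  CREATE TABLE Disruptions_new (...);   -- same columns and constraints as Disruptions
  INSERT INTO Disruptions_new SELECT * FROM Disruptions;
  DROP TABLE Disruptions;
  ALTER TABLE Disruptions_new RENAME TO Disruptions;
  -- recreate the indexes, then:
  ANALYZE;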


Exactly the same commit performance.

We've also tracked the -shm and -wal files and they are around 5MB in 
size.


Mmmm tricky (as Deepthought said).

Rob

On 29 Jul 2018, at 9:14, Rob Willett wrote:


Hi,

Background

We've been refactoring our database to reduce the size of it. Through 
some simple logic we've managed to pull out 99% of the data to reduce 
the size from 51GB down to approx 600MB. This logic has been to remove 
rows that are almost the same but not quite identical. As with all 
things, the thinking was the difficult bit, the execution somewhat 
easier.


As part of the testing we've been doing, we've now hit on an odd and 
weird problem to do with the COMMIT statement. A commit of a few 
hundred (circa 600-800) rows takes approx 7 seconds whereas before we 
never even noticed it, though we now know it was two seconds before. 
Each row is probably 1-2K of data, so it's not very much at all.


Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and that's not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent 
69MB/sec. This is not as fast as we would like, but it's the same across 
all our virtual servers.


4. We've tested the commit on our existing 60GB database and it takes 
2 seconds, which is longer than we thought it would be. The server for 
the 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu 
14.04. The server we are testing on is a 2GB/2 core test server 
running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't 
expect it to take 3 times longer to do a commit.


5. The code is identical across the servers. We are running Perl and 
the DBI module. The code for doing a commit in Perl::DBI is

 $dbh->do("COMMIT");

  We are getting the expected performance elsewhere on the system and 
in the code. It's just the commit that is taking a long time.


6. The code we are committing is adding 600-800 lines to a table that 
used to be 200,000,000 rows in size. It's now 400,000 lines in size. 
We are wondering if the deletion of the lines has had an impact we 
didn't expect. We have vacuumed and analysed the database.


The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE 
ASC, "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status", 
"Disruption_id", "Severity", "levelOfInterest", "category", 
"subCategory", "version");


We have checked that this schema is consistent across the databases.

We're about to recreate the table to see if that makes a difference.

Any help or advice welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Hi,

Background

We've been refactoring our database to reduce the size of it. Through 
some simple logic we've managed to pull out 99% of the data to reduce 
the size from 51GB down to approx 600MB. This logic has been to remove 
rows that are almost the same but not quite identical. As with all 
things, the thinking was the difficult bit, the execution somewhat 
easier.


As part of the testing we've been doing, we've now hit on an odd and 
weird problem to do with the COMMIT statement. A commit of a few hundred 
(circa 600-800) rows takes approx 7 seconds whereas before we never even 
noticed it, though we now know it was two seconds before. Each row is 
probably 1-2K of data, so it's not very much at all.


Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and that's not made any difference (see the note after this list).

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent 69MB/sec. 
This is not as fast as we would like, but it's the same across all our 
virtual servers.


4. We've tested the commit on our existing 60GB database and it takes 2 
seconds, which is longer than we thought it would be. The server for the 
60GB database is a large VPS with 8GB/8 cores and runs Ubuntu 14.04. The 
server we are testing on is a 2GB/2 core test server running Ubuntu 
16.04. Whilst the test server is smaller, we wouldn't expect it to take 
3 times longer to do a commit.


5. The code is identical across the servers. We are running Perl and the 
DBI module. The code for doing a commit in Perl::DBI is

 $dbh->do("COMMIT");

  We are getting the expected performance elsewhere on the system and 
in the code. It's just the commit that is taking a long time.


6. The code we are committing is adding 600-800 lines to a table that 
used to be 200,000,000 rows in size. It's now 400,000 lines in size. We 
are wondering if the deletion of the lines has had an impact we didn't 
expect. We have vacuumed and analysed the database.
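
On points 1 and 2: under WAL the synchronous setting makes a real 
difference to COMMIT cost, so it may be worth confirming what each server 
actually runs with. A sketch:

  PRAGMA journal_mode;          -- expect: wal
  PRAGMA synchronous;           -- 0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA
  PRAGMA synchronous = NORMAL;  -- in WAL mode this skips the fsync on every
                                -- COMMIT (the WAL is only synced at
                                -- checkpoints); the database stays consistent
                                -- after a crash, though the newest commits
                                -- could be lost on power failure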


The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC, 
"category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" COLLATE 
NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE 
NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status", "Disruption_id", 
"Severity", "levelOfInterest", "category", "subCategory", "version");


We have checked that this schema is consistent across the databases.

We're about to recreate the table to see if that makes a difference.

Any help or advice welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Simon,

No, we knew about normalisation, the database is normalised, that part 
of the design hasn't changed in years.


The reasons for the massive reduction in database size is that we worked 
out how to handle repeating lines of data that change just enough that 
we thought they needed to be kept. With some small tweaks we could 
eliminate data that we thought we needed to preserve. We had assumed 
that we needed to keep everything, but by going back in, looking at what 
actual data we now had, we worked out we could store a small amount of 
extra information that stores some state change but this meant we could 
dump most of the database.


It was something that we hadn't appreciated a few years ago, and the 
ever increasing database meant we had to do something about the DB size.


It's taken a few months of work, mainly in a dark room with a wet towel 
on our foreheads, to do the tiny changes necessary to make a big 
difference.


In hindsight the fact that SQLite is so easy and fast to use was a 
slight disadvantage to us, it allowed us to be a bit quick and dirty 
with designs, when we should have thought through some of the issues. 
However that's what startups are like ;)


I think all the explanations to date have been helpful and appreciate 
the time taken to answer, we're going to think a little more carefully 
about how we manage our database on a VM. I'm in the process of moving 
home so the only real hardware (of any note) is sitting in storage so 
the only testing we can do is either virtualised or on Mac laptops. 
Neither of which will help us in this instance.


Rob

On 17 Jul 2018, at 13:16, Simon Slavin wrote:


On 17 Jul 2018, at 1:10pm, R Smith  wrote:

What kind of data did you store (maybe some examples if possible) 
that you could condense it by ~99% like that?


I think that the OP's organisation discovered the 'relational' part of 
RDBMS and implemented normalisation.


To Rob Willett: Ryan Smith's explanation is better than mine.  Please 
read his post and ignore my wooly one.  I tried three ways to get 
across the sequential-access vs. random-access point and wasn't really 
happy with anything I wrote.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had to 
look it up using your reference.


No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then 
try again? My working assumption is that since there is a -wal file we 
are safe to do this.
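
A sketch of that sequence, assuming the interrupted DROP can simply be 
abandoned and retried (the table name is a placeholder):

  PRAGMA secure_delete;        -- currently returns 1, i.e. ON
  PRAGMA secure_delete = OFF;  -- stop zeroing freed pages; this is a
                               -- per-connection setting, so set it again
                               -- in each new session
  DROP TABLE big_table;        -- then retry the drop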


Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have

recorded the information in a far better format thats approx 99% more
efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table

would be quite quick. It's not. So far we've been waiting for 30 mins
and nothing has come back yet. We can see that the -wal file is upto
2.5GB. We have this terrible feeling that it'll need to get to 49GB 
or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had assumed 
that

it would be a really quick and easy operation based on absolutely no
checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to dropping
tables. This probably is because when SQLite drops a table, it has to 
go
through and erase the records in the database file that deal with 
that

table. MySQL and PostgreSQL, on the other hand, use separate files to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation so 
if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we assume
that the table is 49GB then will we need to wait until the -wal file 
is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Simon,

No, we knew about normalisation, the database is normalised, that part 
of the design hasn't changed in years.


The reasons for the massive reduction in database size is that we worked 
out how to handle repeating lines of data that change just enough that 
we thought they needed to be kept. With some small tweaks we could 
eliminate data that we thought we needed to preserve. We had assumed 
that we needed to keep everything, but by going back in, looking at what 
actual data we now had, we worked out we could store a small amount of 
extra information that stores some state change but this meant we could 
dump most of the database.


It was something that we hadn't appreciated a few years ago, and the 
ever increasing database meant we had to do something about the DB size.


It's taken a few months of work, mainly in a dark room with a wet towel 
on our foreheads, to do the tiny changes necessary to make a big 
difference.


In hindsight the fact that SQLite is so easy and fast to use was a 
slight disadvantage to us, it allowed us to be a bit quick and dirty 
with designs, when we should have thought through some of the issues. 
However that's what startups are like ;)


I think all the explanations to date have been helpful and appreciate 
the time taken to answer, we're going to think a little more carefully 
about how we manage our database on a VM. I'm in the process of moving 
home so the only real hardware (of any note) is sitting in storage so 
the only testing we can do is either virtualised or on Mac laptops. 
Neither of which will help us in this instance.


Rob

On 17 Jul 2018, at 13:16, Simon Slavin wrote:


On 17 Jul 2018, at 1:10pm, R Smith  wrote:

What kind of data did you store (maybe some examples if possible) 
that you could condense it by ~99% like that?


I think that the OP's organisation discovered the 'relational' part of 
RDBMS and implemented normalisation.


To Rob Willett: Ryan Smith's explanation is better than mine.  Please 
read his post and ignore my wooly one.  I tried three ways to get 
across the sequential-access vs. random-access point and wasn't really 
happy with anything I wrote.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Simon,

Thanks for this. You make some interesting points about cache hits and 
misses. Up until now, we hadn't seen a massive performance hit, whilst 
some database operations didn't work as fast as we would like them to, 
nothing works as fast we would like.


I'm not familiar with these issues with virtualisation. The VPI we use 
has OpenVZ at it's core (no pun intended). We can see a little on the 
internet about this (well one article specifically about MySQL and 
OpenVZ) but nothing else.


We are in the process of evaluating whether to move to a different VPS, 
some use OpenVZ, some use KVM, internally we use VMWare ESXI. Very 
little real DB work is done on the actual metal, its all virtualised.


Do you have any pointers to stuff we can read up on? We don't understand 
your comment "SQLite spent that whole time accessing your 50GB database 
file in an apparently random order." and would like to try and get more 
information about it.


We have the option of moving off OpenVZ to KVM or ESXI so if we can 
understand the issue, we can make a more informed choice. Whilst our DB 
has dropped down to 500MB we still need to do a fair amount of testing 
and checking to make sure there are no unusual edge cases (or bugs) 
before we promote it to live.


Many thanks

Rob

On 17 Jul 2018, at 12:05, Simon Slavin wrote:

On 17 Jul 2018, at 8:37am, Rob Willett  
wrote:


I suspect that part of the issue is the VPS provider we use has a 
rate limiter on IOPS which is not normally an issue for us, but that 
might have slowed it down somewhat. However I don't think that it 
would have slowed it down by hours.


Actually I think the VPS had a lot to do with the time the operation took. 
 Any kind of virtual machine takes a terrible hit during the sort of 
storage access involved in dropping the table.


SQLite spent that whole time accessing your 50GB database file in an 
apparently random order.   So you had nine hours of cache misses, 
causing the virtual machine to continually write virtual pages back to 
real storage and read other pages into memory.  Virtual systems are 
optimized for cache hits, not cache misses.


I can't prove it without a lot of pointless data manipulation on your 
type of VPS, but I think you found its least optimal operation.  The 
good part is that now your database is less than 1GB long you're going 
to see a massive increase in speed since the whole database may well 
fit in the cache of your virtual machine.


Must remember in future, when people report unusually slow operations, 
to ask whether they're using a virtual machine or real hardware.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett
The table drop finally finished around 07:00 this morning. It took 9.5 
hours to drop the circa 190M rows.


I suspect that part of the issue is the VPS provider we use has a rate 
limiter on IOPS which is not normally an issue for us, but that might 
have slowed it down somewhat. However I don't think that it would have 
slowed it down by hours.


Lessons to be learnt here:

1. Design your database correctly at the start. This was our biggest 
problem, we failed to understand how large a single table would grow and 
didn't manage the administration correctly and in good time.
2. Think carefully about the data you actually need. We managed to throw 
away 99% of our data and combined it together to get a 50GB database 
down to 500MB.
3. Dropping a table may not be the fastest method, creating all the 
other smaller tables in a separate database would have been far quicker. 
There's always more than one way to solve a problem.

4. Turn off secure_delete if it's on.
5. Use the mailing group for advice. Its brilliant!
6. Assess, plan, act. That's a diving phrase but very pertinent here, 
assess what the problem actually is before you plan and execute.


Now we have other checks to do to assure us the database is accurate for 
our needs.


Thanks to all the people who offered advice and help

Rob

On 17 Jul 2018, at 7:02, Rob Willett wrote:


Richard,

Thanks for the comprehensive update.

We appreciate that there are tradeoffs and that dropping a table is a 
rarer operation than most others. The upside of the Sqlite design is 
that we treat a database as a single file which we copy around with 
ease. You cannot do that with many other databases, though this has a 
downside when you need to drop a table.


A less charitable person would have stated that it was our own fault 
for designing the database and using it with approx 190M records in a 
single table with little thought about how we would manage it.  We now 
know this was a dumb idea, however you live and learn.


The impact of this is simply time. We've worked out how to shrink the 
database from 50GB to approx 1GB, after this we can prune the database 
on a weekly basis so this never becomes an issue.


I suppose the key question is would we give up the portability of 
sqlite for a fast table drop? The answer is No.


Rob

On 16 Jul 2018, at 22:59, Richard Hipp wrote:


On 7/16/18, Rob Willett  wrote:


It does look as if one of sqlite's weaknesses is dropping very, very
large tables.



Right.  If every table were stored in a separate file, a DROP TABLE
could be translated to a relatively fast unlink().  But then a
database would be a directory full of files, rather than a single
file, which would undermine a lot of the usefulness of SQLite.
Furthermore, DROP TABLE is an uncommon operation.  We prefer to
provide a database where all content is contained in a single file 
and

that is optimized for SELECTs and for INSERTs and DELETEs of a subset of
the rows in the table, as that seems to be far more useful in most
cases.  There are always engineering trade-offs.  SQLite provides
single-file databases and fast queries in exchange for slower DROP
TABLEs.

A filesystem is able to implement unlink() quickly because it has the
entire partition available for laying out the locations of files.
Space can be allocated to a file in large chunks, which makes
deallocation faster.  In other words, in a filesystem, the files can
be positioned sparsely on disk, with lots of unused space in between
the various files to accommodate growth. But in SQLite, the goal is 
to

keep the database file size as small as possible, and to not have
unnecessary unused pages in the middle of the database.  Hence, space
for tables much be allocated in relatively small 1-page chunks, which
means that there will be a large number of chunks to deallocate when
dropping a large table.  If you were to construct a filesystem that
tried to keep all file content tightly packed at the beginning of a
partition (say, for example, to make shrinking of a partition faster)
then unlink() would necessarily be slower on that filesystem.  That
seems like a bad engineering trade-off for a filesystem, but it is
(for most applications) a good trade-off for a database such as
SQLite.

You can work around this.  If you have one or more tables that you
think might need to be DROP-ed frequently, then consider storing them
in a separate database files and ATTACH-ing those separate database
files to your main database.  Then, to drop those tables, you can
DETACH them, and then unlink() the corresponding database file.  That
should go much faster.
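
A minimal sketch of that workaround (the file and table names are 
invented for illustration):

  ATTACH DATABASE 'bulky.db' AS bulky;       -- separate file for the droppable data
  CREATE TABLE bulky.Disruptions_raw (...);  -- create the big table in the attached file
  -- ... use bulky.Disruptions_raw as normal ...
  DETACH DATABASE bulky;                     -- later, instead of DROP TABLE:
  -- then delete bulky.db (and any bulky.db-wal / bulky.db-shm) at the
  -- filesystem level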

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Richard,

Thanks for the comprehensive update.

We appreciate that there are tradeoffs and that dropping a table is a 
rarer operation than most others. The upside of the Sqlite design is 
that we treat a database as a single file which we copy around with 
ease. You cannot do that with many other databases, though this has a 
downside when you need to drop a table.


A less charitable person would have stated that it was our own fault for 
designing the database and using it with approx 190M records in a single 
table with little thought about how we would manage it.  We now know 
this was a dumb idea, however you live and learn.


The impact of this is simply time. We've worked out how to shrink the 
database from 50GB to approx 1GB, after this we can prune the database 
on a weekly basis so this never becomes an issue.


I suppose the key question is would we give up the portability of 
sqlite for a fast table drop? The answer is No.


Rob

On 16 Jul 2018, at 22:59, Richard Hipp wrote:


On 7/16/18, Rob Willett  wrote:


It does look as if one of sqlite's weaknesses is dropping very, very
large tables.



Right.  If every table were stored in a separate file, a DROP TABLE
could be translated to a relatively fast unlink().  But then a
database would be a directory full of files, rather than a single
file, which would undermine a lot of the usefulness of SQLite.
Furthermore, DROP TABLE is an uncommon operation.  We prefer to
provide a database where all content is contained in a single file and
that is optimized for SELECTs and for INSERTs and DELETEs of a subset of
the rows in the table, as that seems to be far more useful in most
cases.  There are always engineering trade-offs.  SQLite provides
single-file databases and fast queries in exchange for slower DROP
TABLEs.

A filesystem is able to implement unlink() quickly because it has the
entire partition available for laying out the locations of files.
Space can be allocated to a file in large chunks, which makes
deallocation faster.  In other words, in a filesystem, the files can
be positioned sparsely on disk, with lots of unused space in between
the various files to accommodate growth. But in SQLite, the goal is to
keep the database file size as small as possible, and to not have
unnecessary unused pages in the middle of the database.  Hence, space
for tables must be allocated in relatively small 1-page chunks, which
means that there will be a large number of chunks to deallocate when
dropping a large table.  If you were to construct a filesystem that
tried to keep all file content tightly packed at the beginning of a
partition (say, for example, to make shrinking of a partition faster)
then unlink() would necessarily be slower on that filesystem.  That
seems like a bad engineering trade-off for a filesystem, but it is
(for most applications) a good trade-off for a database such as
SQLite.

You can work around this.  If you have one or more tables that you
think might need to be DROP-ed frequently, then consider storing them
in a separate database files and ATTACH-ing those separate database
files to your main database.  Then, to drop those tables, you can
DETACH them, and then unlink() the corresponding database file.  That
should go much faster.

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Jay,

I think your approach would have been quicker. Ten hours so far and it's 
still deleting the table :(


Rob

On 17 Jul 2018, at 2:16, Jay Kreibich wrote:


On Jul 16, 2018, at 3:01 PM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have recorded the information in a far better format thats approx 99% 
more efficient. If only we had been this clever when we started….



Depending on the complexity of the schema, it sounds like it might be 
easier to just dump the other tables, re-create the database without 
this jumbo table, and re-import the other data.
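
Roughly, using the sqlite3 shell (the table names are placeholders for 
whatever should be kept):

  sqlite> .output keep.sql
  sqlite> .dump TableToKeep          -- repeat for each table worth keeping
  sqlite> .dump AnotherTable
  sqlite> .output stdout
  -- then, connected to a brand new database file:
  sqlite> .read keep.sql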


 -j


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett
Mmm It's still taking an awful long time, though the -wal file is 
very small.


It does look as if one of sqlite's weaknesses is dropping very, very 
large tables.


Oh well, lets let it run through the night.

Rob

On 16 Jul 2018, at 21:25, Rob Willett wrote:


Dan,

We've killed the process (kill -9). Fired up sqlite3 again, closed it 
down normally. The -wal files were removed.


Fired up sqlite3 again, turned off secure_delete, started to drop the 
table again, reniced it down (long story to do with IOPS and our VPS 
provider)


-wal file is empty, the -shm file is a steady 32768 bytes.

No idea if anything is actually happening now, but at least we don't 
have a massive -wal file.


Suspect this could take the night (it's 21:24 in London), so we'll 
detach the screen session and come back later.


Many thanks for the help.

Rob

On 16 Jul 2018, at 21:17, Dan Kennedy wrote:


On 07/17/2018 03:12 AM, Rob Willett wrote:

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had 
to

look it up using your reference.

No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and 
then
try again? My working assumption is that since there is a -wal file 
we

are safe to do this.


That sounds fine. Without secure-delete, the wal file should be 
pretty small.


You can always safely kill a process in the middle of an SQLite 
transaction. So long as you don't do anything foolhardy like deleting 
wal or journal files afterwards.


Dan.








Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as 
we have
recorded the information in a far better format thats approx 99% 
more

efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table
would be quite quick. It's not. So far we've been waiting for 30 
mins
and nothing has come back yet. We can see that the -wal file is 
upto
2.5GB. We have this terrible feeling that it'll need to get to 
49GB or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had 
assumed that
it would be a really quick and easy operation based on absolutely 
no

checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to 
dropping
tables. This probably is because when SQLite drops a table, it has 
to go
through and erase the records in the database file that deal with 
that
table. MySQL and PostgreSQL, on the other hand, use separate files 
to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation 
so if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we 
assume
that the table is 49GB then will we need to wait until the -wal 
file is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Dan,

We've killed the process (kill -9). Fired up sqlite3 again, closed it 
down normally. The -wal files were removed.


Fired up sqlite3 again, turned off secure_delete, started to drop the 
table again, reniced it down (long story to do with IOPS and our VPS 
provider)


-wal file is empty, the -shm file is a steady 32768 bytes.

No idea if anything is actually happening now, but at least we don't 
have a massive -wal file.


Suspect this could take the night (it's 21:24 in London), so we'll detach 
the screen session and come back later.


Many thanks for the help.

Rob

On 16 Jul 2018, at 21:17, Dan Kennedy wrote:


On 07/17/2018 03:12 AM, Rob Willett wrote:

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had 
to

look it up using your reference.

No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then
try again? My working assumption is that since there is a -wal file 
we

are safe to do this.


That sounds fine. Without secure-delete, the wal file should be pretty 
small.


You can always safely kill a process in the middle of an SQLite 
transaction. So long as you don't do anything foolhardy like deleting 
wal or journal files afterwards.


Dan.








Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have
recorded the information in a far better format thats approx 99% 
more

efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table
would be quite quick. It's not. So far we've been waiting for 30 
mins
and nothing has come back yet. We can see that the -wal file is 
upto
2.5GB. We have this terrible feeling that it'll need to get to 49GB 
or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had assumed 
that
it would be a really quick and easy operation based on absolutely 
no

checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to 
dropping
tables. This probably is because when SQLite drops a table, it has 
to go
through and erase the records in the database file that deal with 
that
table. MySQL and PostgreSQL, on the other hand, use separate files 
to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation 
so if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we 
assume
that the table is 49GB then will we need to wait until the -wal 
file is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had to 
look it up using your reference.


No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then 
try again? My working assumption is that since there is a -wal file we 
are safe to do this.


Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have

recorded the information in a far better format thats approx 99% more
efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table

would be quite quick. It's not. So far we've been waiting for 30 mins
and nothing has come back yet. We can see that the -wal file is upto
2.5GB. We have this terrible feeling that it'll need to get to 49GB 
or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had assumed 
that

it would be a really quick and easy operation based on absolutely no
checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to dropping
tables. This probably is because when SQLite drops a table, it has to 
go
through and erase the records in the database file that deal with 
that

table. MySQL and PostgreSQL, on the other hand, use separate files to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation so 
if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we assume
that the table is 49GB then will we need to wait until the -wal file 
is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread rob . sqlite

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we have 
recorded the information in a far better format that's approx 99% more 
efficient. If only we had been this clever when we started


We've just 'dropped' the table and were assuming that dropping the table 
would be quite quick. It's not. So far we've been waiting for 30 mins 
and nothing has come back yet. We can see that the -wal file is upto 
2.5GB. We have this terrible feeling that it'll need to get to 49GB or 
so before the table gets dropped. We can just about handle that in the 
current filesystem.


We're now getting nervous about dropping this table. We had assumed that 
it would be a really quick and easy operation based on absolutely no 
checking whatsoever. When we looked on line all we could see was a 
reference to a very, very old and outdated page 
(https://sqlite.org/speed.html) which talks about speed and at the 
bottom of that page the comments


"SQLite is slower than the other databases when it comes to dropping 
tables. This probably is because when SQLite drops a table, it has to go 
through and erase the records in the database file that deal with that 
table. MySQL and PostgreSQL, on the other hand, use separate files to 
represent each table so they can drop a table simply by deleting a file, 
which is much faster.


On the other hand, dropping tables is not a very common operation so if 
SQLite takes a little longer, that is not seen as a big problem."


Is this still the case, is it going to take a long time? If we assume 
that the table is 49GB then will we need to wait until the -wal file is 
at 49GB. By our estimates thats approximately 10 hours away.


Any help or idea or suggestions welcomed, but please be quick.

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I'm getting a compiler error on a SQLite build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread Rob Willett

David,

Thanks for the reply.

We have the latest build-essential

root@preprod1:/jambuster/src/sqlite-src-324/tool# apt-get install 
build-essential

Reading package lists... Done
Building dependency tree
Reading state information... Done
build-essential is already the newest version (12.1ubuntu2).
build-essential set to manually installed.
The following package was automatically installed and is no longer 
required:

  libllvm5.0
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Indeed we have the same gcc version

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609

It works without -O2, do you have optimisations left on?

Rob

On 10 Jul 2018, at 2:18, David Burgess wrote:


I'm using gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609

on 64 bit.

I have built both 64 and 32 bit versions from sqlite-src-324
No problems.

Check your build-essential install?

On Tue, Jul 10, 2018 at 1:05 AM, Dan Kennedy  
wrote:

On 07/09/2018 09:26 PM, Rob Willett wrote:


Dan,

Thanks for the information. We did try that first, failed and just 
assumed

we were idiots and went to the the other download.

Just downloaded it again, so we can check and be on the safe side, 
and we

get the same issue.

cp /jambuster/src/sqlite-src-324/ext/session/sqlite3session.h .
./libtool --mode=compile --tag=CC gcc -g -O2 -DSQLITE_OS_UNIX=1 -I.
-I/jambuster/src/sqlite-src-324/src
-I/jambuster/src/sqlite-src-324/ext/rtree
-I/jambuster/src/sqlite-src-324/ext/icu
-I/jambuster/src/sqlite-src-324/ext/fts3
-I/jambuster/src/sqlite-src-324/ext/async
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H

-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1
-DSQLITE_HAVE_ZLIB=1  -DSQLITE_TEMP_STORE=1 -c sqlite3.c
libtool: compile:  gcc -g -O2 -DSQLITE_OS_UNIX=1 -I.
-I/jambuster/src/sqlite-src-324/src
-I/jambuster/src/sqlite-src-324/ext/rtree
-I/jambuster/src/sqlite-src-324/ext/icu
-I/jambuster/src/sqlite-src-324/ext/fts3
-I/jambuster/src/sqlite-src-324/ext/async
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H

-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1
-DSQLITE_HAVE_ZLIB=1 -DSQLITE_TEMP_STORE=1 -c sqlite3.c  -fPIC -DPIC 
-o

.libs/sqlite3.o
gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:715: recipe for target 'sqlite3.lo' failed

./configure seems OK, tcl looks OK (which was our worry), we think 
there
is a genuine bug in the compiler. . We still need the 
sql_analyzer
program as well :) Its one of those days, I have bad compilers and 
the UK

government could fall.




After running [./configure], open the generated Makefile and remove 
all the

"-O2" switches you can find. Then run [make sqlite3_analyzer].

You'll get an un-optimized binary of course, but there's a good 
chance this
will bypass the compiler bug. And it won't make much difference 
unless your

databases are really big.

Dan.









Rob

On 9 Jul 2018, at 15:02, Dan Kennedy wrote:


On 07/09/2018 03:56 PM, rob.sql...@robertwillett.com wrote:


Hi

I'm trying to build the SQLite packages from the source. I 
**only** want
it to get a copy of sqlite_analyse which for some reason doesn't 
appear to
be easily available as a compiled option. The download package 
doesn't work
on Ubuntu 16.04 and as far as we can see, there are no other 
downloadable
binaries that don't come from China. For some reason, I don't 
really want to

download something from Beijing



The Linux binaries on sqlite.org are for 32-bit systems. For them 
to work

on Ubuntu 16.04, you probably need to install multiarch support.

Or, to build sqlite3_analyzer from source you need the full source
distribution:

  https://www.sqlite.org/2018/sqlite-src-324.zip

Dan.





The full steps I followed are at the very bottom of this e-mail. 
They

are a direct copy from my terminal output.

**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS 
Server
running under OpenVZ on a commercial ISP. It's patched to the 
current
levels. I don't think this is a problem as we don't have any other 
issues

but...

2. SQLite3 is already installed as the normal (and older) SQLIte 
that is

distributed with Ubuntu 16.04 LTS

3. gcc is installed. I have created a small C program to test that 
it

can compile (Hello World) and it's fine. (gcc (Ubuntu
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)

4. Downloaded 
https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz

through wget.

5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into it's own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details 
are

below.

9. 

Re: [sqlite] I'm getting a compiler error on a SQLite build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread Rob Willett

Dan,

Many thanks for this. I removed two instances of -O2 from the Makefile 
and, lo and behold, it compiles.


Sadly my database to check is approx 80GB which could be interesting. 
However I'll take a working (but slower) program over a faster (but not 
working) application every time :)


Rob

On 9 Jul 2018, at 16:05, Dan Kennedy wrote:


On 07/09/2018 09:26 PM, Rob Willett wrote:

Dan,

Thanks for the information. We did try that first, failed and just 
assumed we were idiots and went to the the other download.


Just downloaded it again, so we can check and be on the safe side, 
and we get the same issue.


cp /jambuster/src/sqlite-src-324/ext/session/sqlite3session.h .
./libtool --mode=compile --tag=CC gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H -DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 
-DSQLITE_THREADSAFE=1 -DSQLITE_HAVE_ZLIB=1  -DSQLITE_TEMP_STORE=1 -c 
sqlite3.c
libtool: compile:  gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H -DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 
-DSQLITE_THREADSAFE=1 -DSQLITE_HAVE_ZLIB=1 -DSQLITE_TEMP_STORE=1 -c 
sqlite3.c  -fPIC -DPIC -o .libs/sqlite3.o

gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:715: recipe for target 'sqlite3.lo' failed

./configure seems OK, tcl looks OK (which was our worry), we think 
there is a genuine bug in the compiler. . We still need the 
sql_analyzer program as well :) Its one of those days, I have bad 
compilers and the UK government could fall.



After running [./configure], open the generated Makefile and remove 
all the "-O2" switches you can find. Then run [make sqlite3_analyzer].


You'll get an un-optimized binary of course, but there's a good chance 
this will bypass the compiler bug. And it won't make much difference 
unless your databases are really big.


Dan.








Rob

On 9 Jul 2018, at 15:02, Dan Kennedy wrote:


On 07/09/2018 03:56 PM, rob.sql...@robertwillett.com wrote:

Hi

I'm trying to build the SQLite packages from the source. I **only** 
want it to get a copy of sqlite_analyse which for some reason 
doesn't appear to be easily available as a compiled option. The 
download package doesn't work on Ubuntu 16.04 and as far as we can 
see, there are no other downloadable binaries that don't come from 
China. For some reason, I don't really want to download something 
from Beijing


The Linux binaries on sqlite.org are for 32-bit systems. For them to 
work on Ubuntu 16.04, you probably need to install multiarch 
support.


Or, to build sqlite3_analyzer from source you need the full source 
distribution:


  https://www.sqlite.org/2018/sqlite-src-324.zip

Dan.





The full steps I followed are at the very bottom of this e-mail. 
They are a direct copy from my terminal output.


**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS Server 
running under OpenVZ on a commercial ISP. It's patched to the 
current levels. I don't think this is a problem as we don't have 
any other issues but...


2. SQLite3 is already installed as the normal (and older) SQLIte 
that is distributed with Ubuntu 16.04 LTS


3. gcc is installed. I have created a small C program to test that 
it can compile (Hello World) and it's fine. (gcc (Ubuntu 
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)


4. Downloaded 
https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz through 
wget.


5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into it's own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details 
are below.


9. Ran make and get a compiler failure!  (see below). I'm 
gobsmacked that the compiler has failed to be honest. So my first 
assumption is that we have cocked up something. However we've 
googled and checked and can't see anybody with a similar issue. I 
suspect that one of the Define statements is not playing nicely but 
not wholly sure which one OR we are missing a package that needs to 
be installed, e.g. TCL x,y or z.


root@preprod1:/jambuster/src/sqlite/sqlite-autoconf-324# make
/bin/bash ./libtool  --tag=CC   --mode=compile gcc 
-DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\&

Re: [sqlite] I'm getting a compiler error on a SQLite build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread Rob Willett

Dan,

Thanks for the information. We did try that first, failed and just 
assumed we were idiots and went to the the other download.


Just downloaded it again, so we can check and be on the safe side, and 
we get the same issue.


cp /jambuster/src/sqlite-src-324/ext/session/sqlite3session.h .
./libtool --mode=compile --tag=CC gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session -D_HAVE_SQLITE_CONFIG_H 
-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1   
-DSQLITE_HAVE_ZLIB=1  -DSQLITE_TEMP_STORE=1 -c sqlite3.c
libtool: compile:  gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session -D_HAVE_SQLITE_CONFIG_H 
-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1 
-DSQLITE_HAVE_ZLIB=1 -DSQLITE_TEMP_STORE=1 -c sqlite3.c  -fPIC -DPIC -o 
.libs/sqlite3.o

gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:715: recipe for target 'sqlite3.lo' failed

./configure seems OK, tcl looks OK (which was our worry), we think there 
is a genuine bug in the compiler. We still need the sql_analyzer 
program as well :) It's one of those days, I have bad compilers and the 
UK government could fall.


Rob

On 9 Jul 2018, at 15:02, Dan Kennedy wrote:


On 07/09/2018 03:56 PM, rob.sql...@robertwillett.com wrote:

Hi

I'm trying to build the SQLite packages from the source. I **only** 
want it to get a copy of sqlite_analyse which for some reason doesn't 
appear to be easily available as a compiled option. The download 
package doesn't work on Ubuntu 16.04 and as far as we can see, there 
are no other downloadable binaries that don't come from China. For 
some reason, I don't really want to download something from 
Beijing


The Linux binaries on sqlite.org are for 32-bit systems. For them to 
work on Ubuntu 16.04, you probably need to install multiarch support.


Or, to build sqlite3_analyzer from source you need the full source 
distribution:


  https://www.sqlite.org/2018/sqlite-src-324.zip

Dan.





The full steps I followed are at the very bottom of this e-mail. They 
are a direct copy from my terminal output.


**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS Server 
running under OpenVZ on a commercial ISP. It's patched to the current 
levels. I don't think this is a problem as we don't have any other 
issues but...


2. SQLite3 is already installed as the normal (and older) SQLite that 
is distributed with Ubuntu 16.04 LTS


3. gcc is installed. I have created a small C program to test that it 
can compile (Hello World) and it's fine. (gcc (Ubuntu 
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)


4. Downloaded 
https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz through 
wget.


5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into its own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details are 
below.


9. Ran make and get a compiler failure!  (see below). I'm gobsmacked 
that the compiler has failed to be honest. So my first assumption is 
that we have cocked up something. However we've googled and checked 
and can't see anybody with a similar issue. I suspect that one of the 
Define statements is not playing nicely but not wholly sure which one 
OR we are missing a package that needs to be installed, e.g. TCL x,y 
or z.


root@preprod1:/jambuster/src/sqlite/sqlite-autoconf-324# make
/bin/bash ./libtool  --tag=CC   --mode=compile gcc 
-DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\" 
-DPACKAGE_VERSION=\"3.24.0\" -DPACKAGE_STRING=\"sqlite\ 3.24.0\" 
-DPACKAGE_BUGREPORT=\"http://www.sqlite.org\"; -DPACKAGE_URL=\"\" 
-DPACKAGE=\"sqlite\" -DVERSION=\"3.24.0\" -DSTDC_HEADERS=1 
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 
-DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_FDATASYNC=1 
-DHAVE_USLEEP=1 -DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1 
-DHAVE_DECL_STRERROR_R=1 -DHAVE_STRERROR_R=1 -DHAVE_POSIX_FALLOCATE=1 
-DHAVE_ZLIB_H=1 -I.-D_REENTRANT=1 -DSQLITE_THREADSAFE=1  
-DSQLITE_ENABLE_FTS5 -DSQLITE_ENABLE_JSON1 -DSQLITE_HAVE_ZLIB

Re: [sqlite] I'm getting a compiler error on a SQLite build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread Rob Willett

Clemens,

Thanks. We were up to date anyway, but we did check and we still get the 
same error.


It does look like a compiler bug, also nobody else seems to have 
reported it, which I think is very odd. I cannot be the first person to 
compile this version of SQLite on Ubuntu 16.04 server.


We'll work out how to report it. I still have the same problem of trying 
to get hold of a copy of SQLite_analyze though :( Which is what I wanted 
to do in the first place.


Rob

On 9 Jul 2018, at 11:24, Clemens Ladisch wrote:


rob.sql...@robertwillett.com wrote:

gcc: internal compiler error: Killed (program cc1)


This is a compiler bug.  Check if updating gcc to a current version 
helps.



Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.


Otherwise, do this.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] I'm getting a compiler error on a SQLite build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread rob . sqlite

Hi

I'm trying to build the SQLite packages from the source. I **only** want 
it to get a copy of sqlite_analyse which for some reason doesn't appear 
to be easily available as a compiled option. The download package 
doesn't work on Ubuntu 16.04 and as far as we can see, there are no 
other downloadable binaries that don't come from China. For some reason, 
I don't really want to download something from Beijing


The full steps I followed are at the very bottom of this e-mail. They 
are a direct copy from my terminal output.


**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS Server 
running under OpenVZ on a commercial ISP. It's patched to the current 
levels. I don't think this is a problem as we don't have any other 
issues but...


2. SQLite3 is already installed as the normal (and older) SQLite that is 
distributed with Ubuntu 16.04 LTS


3. gcc is installed. I have created a small C program to test that it 
can compile (Hello World) and it's fine. (gcc (Ubuntu 
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)


4. Downloaded https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz 
through wget.


5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into its own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details are 
below.


9. Ran make and get a compiler failure!  (see below). I'm gobsmacked 
that the compiler has failed to be honest. So my first assumption is 
that we have cocked up something. However we've googled and checked and 
can't see anybody with a similar issue. I suspect that one of the Define 
statements is not playing nicely but not wholly sure which one OR we are 
missing a package that needs to be installed, e.g. TCL x,y or z.


root@preprod1:/jambuster/src/sqlite/sqlite-autoconf-324# make
/bin/bash ./libtool  --tag=CC   --mode=compile gcc 
-DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\" 
-DPACKAGE_VERSION=\"3.24.0\" -DPACKAGE_STRING=\"sqlite\ 3.24.0\" 
-DPACKAGE_BUGREPORT=\"http://www.sqlite.org\"; -DPACKAGE_URL=\"\" 
-DPACKAGE=\"sqlite\" -DVERSION=\"3.24.0\" -DSTDC_HEADERS=1 
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
-DLT_OBJDIR=\".libs/\" -DHAVE_FDATASYNC=1 -DHAVE_USLEEP=1 
-DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1 -DHAVE_DECL_STRERROR_R=1 
-DHAVE_STRERROR_R=1 -DHAVE_POSIX_FALLOCATE=1 -DHAVE_ZLIB_H=1 -I.
-D_REENTRANT=1 -DSQLITE_THREADSAFE=1  -DSQLITE_ENABLE_FTS5 
-DSQLITE_ENABLE_JSON1 -DSQLITE_HAVE_ZLIB  -DSQLITE_ENABLE_FTS3 
-DSQLITE_ENABLE_RTREE  -g -O2 -MT sqlite3.lo -MD -MP -MF 
.deps/sqlite3.Tpo -c -o sqlite3.lo sqlite3.c
libtool: compile:  gcc -DPACKAGE_NAME=\"sqlite\" 
-DPACKAGE_TARNAME=\"sqlite\" -DPACKAGE_VERSION=\"3.24.0\" 
"-DPACKAGE_STRING=\"sqlite 3.24.0\"" 
-DPACKAGE_BUGREPORT=\"http://www.sqlite.org\"; -DPACKAGE_URL=\"\" 
-DPACKAGE=\"sqlite\" -DVERSION=\"3.24.0\" -DSTDC_HEADERS=1 
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
-DLT_OBJDIR=\".libs/\" -DHAVE_FDATASYNC=1 -DHAVE_USLEEP=1 
-DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1 -DHAVE_DECL_STRERROR_R=1 
-DHAVE_STRERROR_R=1 -DHAVE_POSIX_FALLOCATE=1 -DHAVE_ZLIB_H=1 -I. 
-D_REENTRANT=1 -DSQLITE_THREADSAFE=1 -DSQLITE_ENABLE_FTS5 
-DSQLITE_ENABLE_JSON1 -DSQLITE_HAVE_ZLIB -DSQLITE_ENABLE_FTS3 
-DSQLITE_ENABLE_RTREE -g -O2 -MT sqlite3.lo -MD -MP -MF 
.deps/sqlite3.Tpo -c sqlite3.c  -fPIC -DPIC -o .libs/sqlite3.o

gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:539: recipe for target 'sqlite3.lo' failed
make: *** [sqlite3.lo] Error 1

10. I checked the Makefile but there's no obvious issue

11. We've done this three times now with the same result.

12. The fact the compiler barfs is worrying, we know how many people use 
SQLite so for this sort of error to occur is a little surprising and we 
still think it's our fault, but we cannot see what we have done wrong.


Any help or suggestions welcomed. I have to say I am sick to the back 
teeth of reading websites that purport to tell you how to compile SQLite 
to find that it simply consists of apt-get install sqlite3 (or 
whatever). The signal to noise ratio here is very, very low. We need a 
clickbait filter.


Thanks

Rob
-


**Detailed summary**

wget https://www.sqlite.org/2018/sqlite-autoconf-324

Re: [sqlite] Mailing list shutting down...

2018-06-13 Thread rob . sqlite

Richard,

We use Discourse (as a user) to get support for the Mail in a Box 
system. MIAB use Discourse for their support systems. I also think 
Discourse is used for the Ionic support pages as well as they have very 
similar looking interfaces. Until now I paid little attention to them.


I was going to say that I wasn't too impressed with it as a system, then 
I thought again and realised that it actually works pretty well and 
doesn't get in your way too much. That's a pretty good compliment as the 
software isn't in your face all the time telling you how nice it is, 
anybody used Slack recently :) We use it quite a lot and in hindsight it 
works well.


We've never spun a Discourse system up, but I have some spare time this 
evening and might just put one on a VMWare ESXI server and see how it 
looks.


I may be older than Dr Hipp as I can recall running Unix on a 64KB (yep 
KB) box in the 80's, so am very familiar with maximising resources, but 
I go the other way now and run dedicated (but small and self 
contained) boxes that are very focused and don't try to cram as much 
into a single box/instance/VM as possible. I don't care about the fact 
I'm running 30 small Linux boxes on my single ESXI server as I can spin 
them up and most of the time they don't do anything.


I accept the issues over maintenance though, but I have a standard set 
of instructions I follow to harden the boxes and restrict logins with 
things like fail2ban. From start to finish I can have a hardened Ubuntu 
box up in around 20-30 minutes.  Very happy to share these instructions 
as somebody may say they are rubbish and can provide better hardening 
instructions.


I'm UK based, but happy to help, setting this sort of stuff up is 
something I can do and have regularly done (but NOT for Discourse), 
anyway I'm better at this than SQL :)


It's currently 20:30 UK time. I can help; other people have helped me 
enough on this forum, so I feel I can contribute something back.


Thoughts on what needs to be done:

1. Setup the VMware instance correctly based on the Discourse info.

2. Provide some sort of access via ssh, passwords or whatever.

3. Details of IP addresses.

4. Firewall configuration. It's not clear whether these VMs are behind other 
firewalls and what the access rights are, e.g. whether you have https.


5. What's the SSL situation? We've just moved from RapidSSL to 
LetsEncrypt as a) They are free b) They self renew c) They weren't going 
to be blacklisted by Google as they were really Symantec certificates.


6. Does the installation need root access?

7. Postfix information, e.g. is it a satellite, a relay etc etc. One 
wrong move here and we get the IP address and domain name banned. Did 
that for our domain whilst setting up MIAB.


8. Installation of Discourse.

9. How do multiple people work together on the same box? Slack? Skype? 
Shouting loudly?


10. Documenting the build?

11. How to test the build? Testers needed and a test plan needs to be 
put together. SQLite has an excellent reputation, this shouldn't sully 
it.


12. Profit?

Just my 2p worth,

Rob

On 13 Jun 2018, at 19:59, Richard Hipp wrote:


Cross-posted to the fossil-users mailing list since www.fossil-scm.org
and www.sqlite.org are the same machine and both mailing lists are
impacted by the current problem.

On 6/13/18, Luiz Américo  wrote:

How about using https://www.discourse.org/ ?

Open source projects can use for free


Thanks for the pointer, Luiz.

Discourse is moving the right direction, I think.  To install it, one
downloads a docker container and runs it on some Linux VM someplace.
(They recommend Digital Ocean, which is where www3.sqlite.org is
hosted already.)  It's a self-contained package with minimal
dependencies that just works.  And it uses SQLite!  My kind of
software!

Here are my remaining points of heartburn with Discourse:

(1) The installation guide recommends using an external email service,
and they even recommend four appropriate services.  I clicked through
to each one, having never heard of any of them before.  All four are
pushing email marketing for companies sending 10 million or more
emails per month.  It seems to me that aggressive email marketing is
the root cause of my problem in the first place, so I am somewhat
reluctant to engage a marketing firm to help with the solution.
Fortunately, Discourse also allows one to use a self-hosting Postfix
installation, which is what we are currently running on sqlite.org.

(2) Discourse seems to want to run on a machine all by itself.  (It is
written in Rails and has its own webserver.)  I suppose I could spin
up yet another VM to do that.  But I learned this craft in an age
where machines were big and expensive and the goal was to cram as many
services as you could fit onto a single machine and IP address, and so
spinning up a separate machine with its own domain name just to manage
t

Re: [sqlite] Is this really the best way to do this?

2018-05-16 Thread Rob Willett

Clemens, Paul S, Abroży,

Thank you for taking the time to reply. I'll look at each of them and 
see what makes the most sense to me.


It's always good to ask the people here as there is a wealth of 
experience.


As suggestions popped in, I kept thinking of the unix competition which 
was to list all the ways to accomplish printing something out; there's 
always another way to do it. I will learn here as well.


Many thanks to everybody,

Rob

On 16 May 2018, at 11:25, Clemens Ladisch wrote:


Rob Willett wrote:

select CASE
WHEN EXISTS (select 1 from blocked where email = 
'rwillett.dr...@example.com')
THEN (select action from blocked where email = 
'rwillett.dr...@example.com')

ELSE 'OK'
END


SELECT action FROM blocked WHERE email = ?
UNION ALL
SELECT 'OK'
LIMIT 1;


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Is this really the best way to do this?

2018-05-16 Thread Rob Willett

Hi,

I'm experimenting with an email server, Mail In a Box. It's a free 
product and gives you a mail server in a box with SSL certificates, 
multiple domains and seems to work well.


One thing it doesn't do is allow people to specify emails to block.

It uses SQLite to store the underlying data necessary to power Postfix. 
Dr Richard Hipp, add another new application to your list :)


I've worked out how Postfix calls SQLite (pretty easy), worked out what 
I want to do, and am trying to write a single line of SQL that does it.


The need is that the SQL must generate a single string answer depending 
on the email address that is fed in through the query. There are no 
options here, it must return a value even if there is nothing in the 
table. It is not practical to add a table with every email address that 
returns OK.


For this example the only strings it can return are 'OK' and 'DISCARD', 
though the RFC allows other strings. If a query is done on an email and 
it is blocked then it must return DISCARD (or the action in the action 
column). If no matching email is in the table then it must return 'OK'.


As a piece of pseudo code this would be:

function GetMailStatus(emailAddress)
{
    IF emailAddress is present in blocked THEN
        return action associated with emailAddress   -- Action is normally DISCARD

    return 'OK'
}

I've created the table

CREATE TABLE blocked (id INTEGER PRIMARY KEY AUTOINCREMENT,
                      email TEXT NOT NULL UNIQUE,
                      action TEXT NOT NULL DEFAULT 'DISCARD')

I can insert values

insert into blocked (email) values('rwillett.dr...@example.com')
insert into blocked (email) values('rwillett+dr...@example.com')

And this is the code that I have written that works but it looks poor to 
my untutored eyes


select CASE
WHEN EXISTS (select 1 from blocked where email = 
'rwillett.dr...@example.com')
THEN (select action from blocked where email = 
'rwillett.dr...@example.com')

ELSE 'OK'
END


In the Postfix query I'll replace

email = 'rwillett.dr...@example.com'

with email = '%s'

so that the email address is passed in.
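
So the full query, as Postfix would actually see it, should look something 
like this:

select CASE
    WHEN EXISTS (select 1 from blocked where email = '%s')
    THEN (select action from blocked where email = '%s')
    ELSE 'OK'
END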

My worry is that the query has two selects and 'feels' bad. It works but 
feels dirty...


Is there a better way under SQLite?

Any suggestions welcomed.

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] About storage of large amounts of image data

2018-05-08 Thread Rob Willett

Mike,

We use SQLite to store BLOB data to do with polygons and R-trees for 
mapping traffic data. We're running 70-90GB (not looked recently) and 
the size has never been an issue for us (which is why we've not looked).


However binary images might start to get towards the field size limit 
which is defined as 10^9 (from http://www.sqlite.org/limits.html). Now 
1GB might be considered large at the moment, but as image files, 
especially RAW images, creep up due to sensor size, this may be too 
small in a few years time. I know that some people are playing with 
multi gigabyte images from very, very hi-res cameras and stitching stuff 
together, which would get close to this. However this is a compile time 
option (SQLITE_MAX_LENGTH) so you could raise it, though only up to 2^31-1 
bytes (about 2GB), and unless you are into serious astronomy that *might* 
be enough for a while.


I'd be surprised if the images you have are going to seriously stress 
SQLite to be honest. Now if you are looking for a Lightroom replacement, 
that would be an interesting project and I'd love to hear more :)


SQLite advertises itself as a file system replacement (for some uses). 
This could well be one of the use cases.


The thing I would consider is how to backup your multi-terabyte image 
database. Might take some time


All the best,

Rob


On 8 May 2018, at 14:08, Mike Clark wrote:


 Hi List!

I'm developing a project that deals with image files and am 
considering
storing all the images in the SQLite database itself, rather than (or 
in
addition to) the file system.  Since the prospective users will 
probably be

dealing with hundreds of gigabytes in their use of the project, I am
wondering if this is an effective or efficient use of SQLite -- or 
safe,

because of the risk of data corruption.

I know the documentation says that SQLite can handle up to 140 TB (do 
we
know of anyone who is doing this?), so hundreds of gigs is clearly 
doable.


Is it advisable, however?

(Sent this about 2 wks ago, but it never posted to the list and no
moderator response either).​

--
Mike Clark
Twitter: @Cyberherbalist
Blog: Cyberherbalist's Blog <http://www.cyberherbalist.co>
-
"Free will, though it makes evil possible, is also the only thing that
makes possible any love or goodness or joy worth having."
*- C. S. Lewis*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How many AUTOINCREMENT tables are in your schema?

2018-03-17 Thread Rob Willett

We have 17 in one database and 1 in another.

Rob

On 17 Mar 2018, at 11:43, x wrote:


0 in my first sqlite3 DB




From: sqlite-users  on 
behalf of Jean-Christophe Deschamps 

Sent: Saturday, March 17, 2018 9:04:22 AM
To: SQLite mailing list
Subject: Re: [sqlite] How many AUTOINCREMENT tables are in your 
schema?




How many tables in your schema(s) use AUTOINCREMENT?


Maybe a dozen uses in DB designs I made for my own use. Zero or very
few in designs made for others.

My use case of autoincrement is certainly somehow peculiar.

For some of the DBs I use daily for managing my own business (I'm
self-employed) I didn't feel the need to write ad-hoc applications and
I only use a third-party SQLite manager (SQLite Expert is open 24/7).
Of course such use demands real care and a lot of fancy constraints,
triggers, add-on functions, external procedures, etc to be workable.
This isn't scalable nor usable by anyone else.

Several of my tables are best viewed/edited as LIFOs: the more recent
entries on top. For instance a table of orders, where you prefer recent
entries to be on top of the table when viewed by "natural" (ID) order.
To achieve that effect I use autoincrement and triggers which negate
the rowid alias at insert. The sqlite_sequence entry is handy to make a
new row get an unused ID which, once negated, will show first when
viewing the table, even though there is no longer any non-negative ID in
the actual table.
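
A minimal sketch of that arrangement (the table and trigger names here are
invented purely for illustration):

CREATE TABLE orders(id INTEGER PRIMARY KEY AUTOINCREMENT, descr TEXT);

-- flip the freshly assigned rowid to its negative straight after insert,
-- so the newest row sorts first when ordered by id
CREATE TRIGGER orders_lifo AFTER INSERT ON orders
BEGIN
    UPDATE orders SET id = -NEW.id WHERE id = NEW.id;
END;

INSERT INTO orders(descr) VALUES ('first');    -- ends up stored with id = -1
INSERT INTO orders(descr) VALUES ('second');   -- ends up stored with id = -2
SELECT * FROM orders ORDER BY id;              -- most recent entry comes out first

sqlite_sequence keeps counting upwards (1, 2, 3, ...) even though the stored
ids are all negative, which is what makes the trick work.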

I wouldn't have any use for autoincrement if my DB manager had a
settable ORDER BY clause for basic table viewing/editing.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Preventing a slow, determinstic function from running twice

2018-01-05 Thread Rob Hoelz
On Fri, 5 Jan 2018 14:42:00 -0600
Rob Hoelz  wrote:

> On Fri, 5 Jan 2018 15:04:16 -0500
> Richard Hipp  wrote:
> 
> > On 1/5/18, r...@hoelz.ro  wrote:
> > > Hi SQLite users and devs,
> > >
> > > I have an application using SQLite which stores XZ-compressed
> > > blobs of JSON from the Twitter API to minimize disk usage.  My
> > > unxz function is a little slow, and I've noticed that if I
> > > specify the function several times in a query, it gets run
> > > multiple times, even though it's deterministic.  For example:
> > >
> > 
> > Your example did not go through.  Please resend.  Use plain-text
> > email for best results.
> > 
> > >
> > >
> > > In the above query, unxz is run three times, even though content
> > > doesn't change within the same row.  Is there a way to tell SQLite
> > > to only run a function once per row of results?  I looked into
> > > https://sqlite.org/c3ref/get_auxdata.html, but it appears that
> > > metadata is only stored for compile-time constants.
> > >
> > > Thanks,
> > > Rob
> > >
> > >
> > >
> > >
> > > --
> > > Sent from: http://sqlite.1065341.n5.nabble.com/
> > > ___
> > > sqlite-users mailing list
> > > sqlite-users@mailinglists.sqlite.org
> > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> > >
> > 
> > 
> 
> 
> Here's the example:
> 
> > select json_ref(unxz(content), '$.user.id'), json_ref(unxz(content),
> > '$.full_text) from archive where json_ref(unxz(content),
> > '$.full_text') like '%dogs%';
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Sorry - I just noticed that I wrote "json_ref" where I meant to write 
"json_extract".
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Preventing a slow, determinstic function from running twice

2018-01-05 Thread Rob Hoelz
On Fri, 5 Jan 2018 12:02:21 -0800
Jens Alfke  wrote:

> > On Jan 5, 2018, at 11:25 AM, r...@hoelz.ro wrote:
> > 
> > … My unxz function is a little slow, and I've noticed that if I
> > specify the function several times in a query, it gets run multiple
> > times, even though it's deterministic. … Is there a way to tell
> > SQLite to only run a function once per row of results?
> 
> Not currently. There was a long-ish email thread about that here last
> September, with subject "Common subexpression optimization of
> deterministic functions”. I don’t think there was any resolution,
> just that it’s an optimization that hasn’t been implemented yet.
> 
> —Jens
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Thanks for letting me know about the thread, Jens - I did a cursory search of 
the archives but I must've been using the wrong keywords!  I'll read up on that.

-Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Preventing a slow, determinstic function from running twice

2018-01-05 Thread Rob Hoelz
On Fri, 5 Jan 2018 15:04:16 -0500
Richard Hipp  wrote:

> On 1/5/18, r...@hoelz.ro  wrote:
> > Hi SQLite users and devs,
> >
> > I have an application using SQLite which stores XZ-compressed blobs
> > of JSON from the Twitter API to minimize disk usage.  My unxz
> > function is a little slow, and I've noticed that if I specify the
> > function several times in a query, it gets run multiple times, even
> > though it's deterministic.  For example:
> >
> 
> Your example did not go through.  Please resend.  Use plain-text email
> for best results.
> 
> >
> >
> > In the above query, unxz is run three times, even though content
> > doesn't change within the same row.  Is there a way to tell SQLite
> > to only run a function once per row of results?  I looked into
> > https://sqlite.org/c3ref/get_auxdata.html, but it appears that
> > metadata is only stored for compile-time constants.
> >
> > Thanks,
> > Rob
> >
> >
> >
> >
> > --
> > Sent from: http://sqlite.1065341.n5.nabble.com/
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> 
> 


Here's the example:

> select json_ref(unxz(content), '$.user.id'), json_ref(unxz(content),
> '$.full_text) from archive where json_ref(unxz(content),
> '$.full_text') like '%dogs%';
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Move to Github!!?

2017-12-27 Thread Rob Willett

Dear all,

I've enjoyed reading about the Github vs Fossil discussion. There's been 
some passionate and provocative statements :)


What its done for me is make me think again about how we manage our 
source code control which is always good. We very recently moved from 
our existing text based RCS system to a privately hosted Git system. I 
know RCS is possibly older than the average reader on this mailing list, 
but it works very well for us and the workflow 'just works'. Until 
recently we haven't needed anything more so why change.


RCS started showing its age as we have moved to more and more people 
working out of the office more and more, often with no internet access, 
and we became more distributed. So after an extensive checking of the 
various SCM systems out there, which lasted almost a whole hour, we 
decided to try out Git on a private server. Private servers are pretty 
cheap and we didn't want to pay Github. We were also acutely aware that 
business models change on a whim and we didn't want to be tied into 
Github financially (small though it might be) and technically.


We have spent a month looking at how we use Git and we don't feel a 100% 
happy with it, mind you 75M other people seem to think its OK, but this 
mailing list started us thinking about Fossil. This mailing list has 
always been a source of accurate (and sometimes passionate) information. 
It has rarely (never?) had much 'fluff' in it. (Fluff is an English 
phrase meaning ephemeral or lightweight). Whilst I may not agree with 
everything or even understand most of the SQL, the quality is very, very 
high and when people state something, its normally worth listening to. 
We tend to lurk rather than contribute as other people know SQLite far, 
far better than we do. If you have any Perl questions please ask :)


It's pretty easy to setup Git on a VPS, took an hour or so before we had 
it going and it started well. What we then found was that the 
documentation and the commands were exceptionally confusing. It appeared 
to us that most of the documentation seems self referential and we kept 
going round and round in circles trying to understand why something was 
written the way it was. Even though 75M people seem to use Git, it was a 
struggle to understand it. We accept that it might be us, but we felt 
like we were walking through treacle.


The next issue we found was that large binary files can be problematic. 
We have started making Twitter video ads to promote our apps. In the 
interest of fairness and not to self advertise I won't promote them here 
:) However we have 20GB of video and pictures we've created over the 
last few months. It's amazing how much space (and time) can be taken up 
with this sort of thing. We found that by itself Git doesn't really like 
binary files that much and that you tend to need an extension to handle 
Large Binary Files. There also appear to be different extensions to 
handle this sort of stuff, which makes it painful to work out which is the 
best system. It could also be that we simply haven't got the hang of Git.


So we started looking again at Fossil and like what we saw. We have no 
emotional commitment to Git, we want an easy to use SCM that works with 
Emacs (I know, I know), and runs on Linux and Macs. Fossil looks like it 
might fit the bill. There's an Emacs mode so that box is ticked, there's 
a command line mode, so that's easy, and the command set seems to be 
orthogonal which is important.


The one area we struggled with is how we might handle large binary files 
(video files). These are made in Final Cut Pro X (FCPX) (please don't 
get us started on the backup capability of FCPX), Screenflow, an 
excellent video editing system that doesn't try to do what FCPX does, 
but does 'simple' videos very well and is fantastically quick to work 
with, Affinity Designer and Affinity Photo. None of these really produce 
XML files so are genuine binary files. It appears that Fossil seems to 
handle binary files 
(https://www.fossil-scm.org/index.html/doc/trunk/www/fileformat.wiki) 
but we struggled to work out how these really work. We'd be very 
interested in peoples experiences here with Fossil with large binary 
files (300MB). It would be great to have a single repository where we 
can store text files, binary files and have these managed easily.


Am aware that this discussion might be better on the Fossil mailing 
list, but since there's so much information on here, we thought we'd ask 
this group first.


Thanks very much for reading,

Rob

On 26 Dec 2017, at 22:47, Jungle Boogie wrote:


On Tue 26 Dec 2017  3:08 PM, Damien Sykes wrote:

Hi,
This is a question I have asked myself many times (I.E. Git projects 
moving

to Fossil).
GitHub is well known and boasts over 74 million repositories, yet 
Fossil,
which both hosts and utilises one of the most well-kno

[sqlite] Error: ambiguous column name

2017-11-14 Thread Rob Golsteijn
Hi List,



Given a table created as:   create table aaa(a);

Sqlite reports an error "ambiguous column name: main.aaa.a" for the following 
query.


select * from aaa, aaa;
Error: ambiguous column name: main.aaa.a



And also for similar queries



select * from aaa INNER JOIN aaa;
select * from aaa CROSS JOIN aaa;

select * from aaa JOIN aaa;



Tested with sqlite version 3.21.0 and an old version 3.8.4.3.



I think the query is valid and should not result in an error. Typically Sqlite 
would name the resulting columns "a" and "a:1" in this case.

Workaround: add an alias for one of the tables in the join (both columns will 
be called "a").



Met Vriendelijke Groet, Kind Regards, 谨致问候,

Rob Golsteijn
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Can't find the source to sqldiff - And yes I have looked :)

2017-09-19 Thread Rob Willett
Thanks,

We have found it under sqlite-src-3200100. We clearly missed it.

Apologies

Rob

On 19 Sep 2017, at 9:18, Rowan Worth wrote:

> I don't have sqlite-src-3200100.zip on hand but it seems to be there in
> tool/sqldiff.c for sqlite-src-3081002 (and in the fossil repo).
>
> -Rowan
>
> On 19 September 2017 at 15:58, Rob Willett 
> wrote:
>
>> Hi,
>>
>> I'm trying to find the source to sqldiff and am struggling to locate it.
>>
>> I've looked on
>>
>> https://sqlite.org/download.html
>>
>> and downloaded
>>
>> sqlite-amalgamation-3200100.zip
>> sqlite-autoconf-3200100.tar.gz
>> sqlite-src-3200100.zip
>>
>> unzipped (or gunzipped) them all and still can't see sqldiff.
>>
>> I've also Googled for it and can see some tart replies saying its on the
>> sqlite download page. I can see that this page says its included in the
>> official distribution but I can't find it
>>
>> http://sqlite.1065341.n5.nabble.com/sqlitediff-td82394.html
>>
>> and can see from Richard Hipp that it *might* be here
>>
>> https://www.sqlite.org/src/artifact/0748c0daed08f31e
>>
>> I can't believe that the above link is the only way to get the sqldiff
>> program and is this the latest and greatest.
>>
>> Have I missed something important or a link somewhere? I've been looking
>> for the last hour and am struggling here to find something that should be
>> reasonably obvious.
>>
>> I need the source as I need to compile a 64bit version for a Linux box
>> that doesn't have the 32bit libraries on. Loading the 32 bit libraries is
>> not an option for us.
>>
>> Any help or suggestions welcomed.
>>
>> Thanks
>>
>> Rob
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Can't find the source to sqldiff - And yes I have looked :)

2017-09-19 Thread Rob Willett

Hi,

I'm trying to find the source to sqldiff and am struggling to locate it.

I've looked on

https://sqlite.org/download.html

and downloaded

sqlite-amalgamation-3200100.zip
sqlite-autoconf-3200100.tar.gz
sqlite-src-3200100.zip

unzipped (or gunzipped) them all and still can't see sqldiff.

I've also Googled for it and can see some tart replies saying it's on the 
sqlite download page. I can see that this page says its included in the 
official distribution but I can't find it


http://sqlite.1065341.n5.nabble.com/sqlitediff-td82394.html

and can see from Richard Hipp that it *might* be here

https://www.sqlite.org/src/artifact/0748c0daed08f31e

I can't believe that the above link is the only way to get the sqldiff 
program and is this the latest and greatest.


Have I missed something important or a link somewhere? I've been looking 
for the last hour and am struggling here to find something that should 
be reasonably obvious.


I need the source as I need to compile a 64bit version for a Linux box 
that doesn't have the 32bit libraries on. Loading the 32 bit libraries 
is not an option for us.


Any help or suggestions welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How can I lock a database?

2017-07-28 Thread Rob Richardson
I have a program that reads data from 17 PLCs and writes it into SQLite databases 
every minute.  Every midnight, it deletes old data and vacuums the databases.  
It's behaving strangely after that.  I think the problem begins because the 
vacuum operation is still going on at 12:01, when the next read is scheduled.  
SQlite throws a "database is locked" error.  

I tried to replicate this by writing a little program that vacuums one SQLite 
database repeatedly for three minutes.  The first time I tried to use it, the 
main program behaved as expected, showing the same odd behavior.  But the next 
two times I tried, my program sailed right through the "database is locked" 
error, running as it was designed with no problems.  

So, instead of running vacuum over and over again, I would like to do something 
that would lock my database against writing once, and then not release it for 
three minutes.  How can I do that?
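
I am guessing that something along these lines (an explicit transaction held 
open for the duration) is what I need, but I don't know if it is the right 
approach:

BEGIN EXCLUSIVE;   -- takes an exclusive lock now, or fails with "database is locked"
-- ... keep the transaction open for as long as the database should stay locked ...
COMMIT;            -- releases the lock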

Thank you very much.

RobR
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite Port/ODBC Driver Question

2017-07-10 Thread Rob Willett

It depends on what you mean remotely.

By itself SQLite doesn't have any networking library built in. It's an 
embedded database.


You can put application wrappers around the database, I believe that 
wrappers exist to make SQLite into a true client/server but that's 
additional code. Also there is ODBC, but there's nothing (AFAIK) in the 
actual codebase itself that allows any remote connectivity.


Clearly you can put Apache/Nginx/PHP/SQLite into a software stack and 
make it work, we actually use Nginx/Mojolicious/SQLite as our platform 
stack but there's nothing in there that allows any remote access to 
SQLite.


If you are talking about hosting the database on a network volume, I 
would recommend that you read this


https://sqlite.org/whentouse.html

The very first paragraph states what Sqlite can do. I would also pay 
close attention to
"If there are many client programs sending SQL to the same database over 
a network, then use a client/server database engine instead of SQLite. 
SQLite will work over a network filesystem, but because of the latency 
associated with most network filesystems, performance will not be great. 
Also, file locking logic is buggy in many network filesystem 
implementations (on both Unix and Windows). If file locking does not 
work correctly, two or more clients might try to modify the same part of 
the same database at the same time, resulting in corruption. Because 
this problem results from bugs in the underlying filesystem 
implementation, there is nothing SQLite can do to prevent it.


A good rule of thumb is to avoid using SQLite in situations where the 
same database will be accessed directly (without an intervening 
application server) and simultaneously from many computers over a 
network."


Just my 2p worth,

Rob

On 10 Jul 2017, at 14:14, Igor Korot wrote:


Rob,

On Mon, Jul 10, 2017 at 7:06 AM, Rob Willett
 wrote:

Vishal,

SQLite isn't a traditional client/server relational database, 
therefore

there isn't a port to open up. It runs on a local machine.


I believe SQLite can successfully be run remotely.

Thank you.



Now there are wrappers around SQLite to extend it, I assume this ODBC 
driver

is one of them.

I suspect people here *may* know the answer regarding any ports the 
ODBC
driver uses, but you may be better off asking the maintainer of the 
ODBC

driver.

Rob


On 10 Jul 2017, at 1:31, Shukla, Vishal wrote:


Hi,
Am trying to open a firewall to the machine having sqlite database. 
Does
the SQLite database use a specific port number ? If not, then does 
the ODBC

connection to SQLite using ODBC driver use a port ?

Any help will be greatly appreciated.

SQLite ODBC Driver:
http://www.ch-werner.de/sqliteodbc/sqliteodbc.exe

Regards,
Vishal Shukla

Confidential communication
Westpac Banking Corporation (ABN 33 007 457 141)
Westpac Institutional Bank is a division of Westpac Banking 
Corporation

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite Port/ODBC Driver Question

2017-07-10 Thread Rob Willett

Vishal,

SQLite isn't a traditional client/server relational database, therefore 
there isn't a port to open up. It runs on a local machine.


Now there are wrappers around SQLite to extend it, I assume this ODBC 
driver is one of them.


I suspect people here *may* know the answer regarding any ports the ODBC 
driver uses, but you may be better off asking the maintainer of the ODBC 
driver.


Rob

On 10 Jul 2017, at 1:31, Shukla, Vishal wrote:


Hi,
Am trying to open a firewall to the machine having sqlite database. 
Does the SQLite database use a specific port number ? If not, then 
does the ODBC connection to SQLite using ODBC driver use a port ?


Any help will be greatly appreciated.

SQLite ODBC Driver:
http://www.ch-werner.de/sqliteodbc/sqliteodbc.exe

Regards,
Vishal Shukla

Confidential communication
Westpac Banking Corporation (ABN 33 007 457 141)
Westpac Institutional Bank is a division of Westpac Banking 
Corporation

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] documentation flaws

2017-06-02 Thread Rob Golsteijn
Hi List,



I found 2 flaws in the online documentation:

Documentation conflict:

Section "1.3. Limitations" on http://sqlite.org/sessionintro.html 
<http://sqlite.org/sessionintro.html> states:
"• Prior to SQLite version 3.17.0, the session extension only worked with rowid 
tables, not WITHOUT ROWID tables. As of 3.17.0, both rowid and WITHOUT ROWID 
tables are supported."

This contradicts with the statement in item 7 of Section "2. Differences From 
Ordinary Rowid Tables" on page http://sqlite.org/withoutrowid.html 
<http://sqlite.org/withoutrowid.html> 
"Note that since the session extension uses the update hook, that means that 
the session extension will not work correctly on a database that includes 
WITHOUT ROWID tables."



The latter remark is probably outdated, or needs some refinement.



Secondly, a typo:
"changset" should be "changeset" in first line of Section "2.2. Conflicts" on 
page http://sqlite.org/sessionintro.html <http://sqlite.org/sessionintro.html> 



Met Vriendelijke Groet, Kind Regards, 谨致问候,

Rob Golsteijn
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] -shm and -wal files left after trivially simple select statement from command line

2017-05-15 Thread Rob Willett

Richard, J Decker,

Just to close this down.

The command was

select * from device;

We used .quit and Ctrl-D to exit the shell

The command sequence was as simple as

sqlite3 DATABASE.sqlite

We knew about things like less and head and so we avoided them.

The version of Sqlite seems to be the problem, the standard one with 
Ubuntu 14.04 LTS is 3.8.2 which we know is old, but until now seems to 
have been fine.


Upgrading our production servers to Ubuntu 16 LTS is a massive 
undertaking. We have nine of them with test and preprod so upgrading and 
testing everything again is many, many weeks work :(


We've upgraded Sqlite CLI to the latest 3.18.0 and are trying this out 
in a nice safe environment.


Thanks

Rob

On 15 May 2017, at 12:38, Richard Hipp wrote:


On 5/15/17, Richard Hipp  wrote:

On 5/15/17, Rob Willett  wrote:


As part of the testing we noticed that -shm and -wal files were 
being
left after we used sqlite3 on the command line. This puzzled us as 
we

didn't see any errors in our test scripts.


How is the command being ended.  Are you inserting a ".quit" command?
Are you somehow causing the shell to terminate prematurely, perhaps 
by
piping the output into a filter (such as "head") that closes the 
input

connection early?  What version of the SQLite shell is running?  What
is the query?


Never mind - it appears that JDecker already knew the problem and the
solution

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] -shm and -wal files left after trivially simple select statement from command line

2017-05-15 Thread Rob Willett
Thanks for this. This seems to be the old version of SQLite that comes 
with Ubuntu. We'll upgrade.


Just to be clear we do close the connection as all we do is open the 
database using sqlite3 from the command line, do a select and close 
sqlite3 from the command line.


We do not do anything that modifies the database.

We'll have to upgrade the version of SQLite, test and then upgrade our 
database.


Thanks

Rob

On 15 May 2017, at 12:09, J Decker wrote:

that was an old issue... current sqlite versions do not have this 
issue.

You MUST close the connection though, for the files to get deleted.

http://sqlite.1065341.n5.nabble.com/journal-files-not-always-removed-td83700.html#a83705

On Mon, May 15, 2017 at 3:01 AM, Rob Willett 


wrote:


Hi,

We've encountered what we think is an odd situation and we can't find 
any
explanation for why this is. We're also not sure if its a problem or 
not.


A brief summary is that we are doing a major database upgrade and are
doing (lots) of testing. The box is a Linux box running Ubuntu 14.04 
LTS
Server. This is a standard box we have used for a few years, we 
*might*

move to 16 LTS shortly.

As part of the testing we noticed that -shm and -wal files were being 
left
after we used sqlite3 on the command line. This puzzled us as we 
didn't see

any errors in our test scripts.

We then narrowed this problem down to doing a simple SELECT statement 
on a

table.

e.g. This is our starting state. Note that nothing is touching these
database as they are isolated in a directory, so there is no other 
process

playing with them.

root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66892
-rw-r--r-- 1 root root 34217984 May 15 10:38 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig


root@demonotifications:/jambuster/notifications/upgrade_test# cp
accountmanagement.sqlite.orig accountmanagement.sqlite

root@demonotifications:/jambuster/notifications/upgrade_test# sqlite3
accountmanagement.sqlite
-- Loading resources from /root/.sqliterc

SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from device;
UUID isValid 
Version

   WhatAmI Application
---  --
--  --  --
<  1   1.1
   ios_app JambusterForLondon
.. Hundreds more lines of device data which looks correct
<  1   1.1
   ios_app JambusterForLondon

sqlite> .exit
root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66924
-rw-r--r-- 1 root root32768 May 15 10:48 
accountmanagement.sqlite-shm
-rw-r--r-- 1 root root0 May 15 10:48 
accountmanagement.sqlite-wal

-rw-r--r-- 1 root root 34217984 May 15 10:47 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig

root@demonotifications:/jambuster/notifications/upgrade_test


As can be seen above, we have accountmanagement.sqlite-shm and
accountmanagement.sqlite-wal files left which is odd. All of this 
from a

select statement.

If we do a pragma integrity check we get

  integrity_check
  ---
  ok

All the reading we have done seems to indicate that having -shm and 
-wal
files indicates an error, if this was our code, we'd be looking deep 
into
that but a simple select statement shouldn't cause this sort or 
error, can

it?

The select statement has a number of foreign key delete cascades but 
thats
it, and as we are not deleting anything this shouldn't be an issue 
anyway.


Do we even have a problem? Going back into the database and closing 
it
again, just with the command line utility sqlite3 doesn't clear the 
files.


Any help or suggestions welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] -shm and -wal files left after trivially simple select statement from command line

2017-05-15 Thread Rob Willett

Hi,

We've encountered what we think is an odd situation and we can't find 
any explanation for why this is. We're also not sure if it's a problem or 
not.


A brief summary is that we are doing a major database upgrade and are 
doing (lots) of testing. The box is a Linux box running Ubuntu 14.04 LTS 
Server. This is a standard box we have used for a few years, we *might* 
move to 16 LTS shortly.


As part of the testing we noticed that -shm and -wal files were being 
left after we used sqlite3 on the command line. This puzzled us as we 
didn't see any errors in our test scripts.


We then narrowed this problem down to doing a simple SELECT statement on 
a table.


e.g. This is our starting state. Note that nothing is touching these 
databases as they are isolated in a directory, so there is no other 
process playing with them.


root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66892
-rw-r--r-- 1 root root 34217984 May 15 10:38 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig


root@demonotifications:/jambuster/notifications/upgrade_test# cp 
accountmanagement.sqlite.orig accountmanagement.sqlite


root@demonotifications:/jambuster/notifications/upgrade_test# sqlite3 
accountmanagement.sqlite

-- Loading resources from /root/.sqliterc

SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from device;
UUID isValid Version 
WhatAmI Application
---  --  
--  --  --
<  1   1.1 
ios_app JambusterForLondon

.. Hundreds more lines of device data which looks correct
<  1   1.1 
ios_app JambusterForLondon


sqlite> .exit
root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66924
-rw-r--r-- 1 root root32768 May 15 10:48 
accountmanagement.sqlite-shm
-rw-r--r-- 1 root root0 May 15 10:48 
accountmanagement.sqlite-wal

-rw-r--r-- 1 root root 34217984 May 15 10:47 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig

root@demonotifications:/jambuster/notifications/upgrade_test


As can be seen above, we have accountmanagement.sqlite-shm and 
accountmanagement.sqlite-wal files left which is odd. All of this from a 
select statement.


If we do a pragma integrity check we get

  integrity_check
  ---
  ok

All the reading we have done seems to indicate that having -shm and -wal 
files indicates an error, if this was our code, we'd be looking deep 
into that but a simple select statement shouldn't cause this sort of 
error, can it?


The select statement has a number of foreign key delete cascades but 
thats it, and as we are not deleting anything this shouldn't be an issue 
anyway.


Do we even have a problem? Going back into the database and closing it 
again, just with the command line utility sqlite3 doesn't clear the 
files.


Any help or suggestions welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Difference between localtime and utc is 8 hours, but should be 4

2017-04-21 Thread Rob Richardson
That makes sense.  Thank you very much.

RobR

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Stephen Chrzanowski
Sent: Friday, April 21, 2017 4:29 PM
To: SQLite mailing list
Subject: Re: [sqlite] Difference between localtime and utc is 8 hours, but 
should be 4

Let me clarify;

What you store in the database is just a number.  There is no indication to 
what timezone that references.  So when you convert UTC to UTC, you're taking a 
time already set for UTC and converting it to another 4 hours earlier (Or 
later? -- i hate time math).  When I say "UTC to UTC", the first UTC isn't 
actually UTC, but assumed local.

From https://sqlite.org/lang_datefunc.html under Modifiers section:

The "localtime" modifier (12) assumes the time string to its left is in 
Universal Coordinated Time (UTC) and adjusts the time string so that it 
displays localtime. If "localtime" follows a time that is not UTC, then the 
behavior is undefined. The "utc" modifier is the opposite of "localtime".
"utc" assumes that the string to its left is in the local timezone and adjusts 
that string to be in UTC. If the prior string is not in localtime, then the 
result of "utc" is undefined.




On Fri, Apr 21, 2017 at 4:24 PM, Stephen Chrzanowski 
wrote:

> Because you're converting your UTC time to UTC.
>
> On Fri, Apr 21, 2017 at 4:03 PM, Rob Richardson 
> 
> wrote:
>
>> Hello!
>>
>> I'm in the Eastern US time zone, in daylight savings time.  I am four 
>> hours earlier than UTC time.  I have a column that stores UTC times 
>> as Julian times (floating-point numbers).  The latest data point in 
>> the table was stored at about 8:41 this morning (4/21).
>>
>> I am getting strange results from this query:
>> select max(value_timestamp),
>> datetime(max(julianday(value_timestamp)), 'localtime'), 
>> datetime(max(julianday(value_timestamp)), 'utc') from trend_data
>>
>> The results are:
>> 2457864.86179398
>> 2017-04-21 04:40:59
>> 2017-04-21 12:40:59
>>
>> How is it that switching from local time to UTC gives an eight-hour 
>> difference?
>>
>> Thank you very much.
>>
>> RobR
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Difference between localtime and utc is 8 hours, but should be 4

2017-04-21 Thread Rob Richardson
Hello!

I'm in the Eastern US time zone, in daylight savings time.  I am four hours 
earlier than UTC time.  I have a column that stores UTC times as Julian times 
(floating-point numbers).  The latest data point in the table was stored at 
about 8:41 this morning (4/21).  

I am getting strange results from this query:
select max(value_timestamp), 
datetime(max(julianday(value_timestamp)), 'localtime'),
datetime(max(julianday(value_timestamp)), 'utc') from trend_data

The results are: 
2457864.86179398
2017-04-21 04:40:59
2017-04-21 12:40:59

How is it that switching from local time to UTC gives an eight-hour difference?

Thank you very much.

RobR
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-20 Thread Rob Willett

Just to close this thread down completely.

We've implemented the changes on our production server and the actual 
performance increase is even better than we predicted.


Our production server is a containerised Ubuntu Server. Its hardware 
configuration is significantly different to our Macs on which we 
develop. Our Macs have more RAM and SSD disks. The production server has 
other advantages though :)


The query took 90 mins on our production server, this is mainly due to 
slower disks than we have locally, so this was not unexpected.


After doing the performance changes (removing unneeded fields and adding 
in a covering index), we went down to 38 secs.


This has meant we are no longer looking at a second database just for 
analytics, we just needed to learn to use the database we already had :)


Rob

On 18 Mar 2017, at 5:48, Rob Willett wrote:

We've just implemented a covering index for the last step (is it 
really?) in our quest to get the query execution time down.


To summarise we have gone from 32 mins to 16 mins by updating an index 
so it doesn't use collate, we took another six minutes off by removing 
extra fields in the select we didn't need.


We have just created a new index which 'covers' all the fields we use 
in the select, this means (and I paraphrase) that we use the index to 
get all the data and there is no need to read from the database.


Well that was a bit of a surprise, the index creation took 45 mins, we 
ran the program again and thought, rats, we've cocked it up, it only 
took 54 secs, we must have got something wrong. So we checked it and checked 
again and we hadn't got anything wrong. Our query has moved from 32 
mins to 54 secs.


We're quite happy with that performance increase. In fact we're 
delighted, so thanks for all the help in getting us to this stage.


We have kept copies of the query planner bytecode output if anybody is 
interested. Gunter has had copies, but if anybody else would like 
them, please ask.


Many thanks again for all the help,

Rob

On 17 Mar 2017, at 22:12, Rob Willett wrote:


Dear all,

We progress steadily forward.

1. We immediately halved our execution time by moving to an updated 
index that doesn't use COLLATE. That's 32 mins to 16 mins.


2. We then shaved a further six minutes off the execution time by 
removing extraneous fields in the select statement, so instead of 
"select * ...", we identified which fields we used and directly 
selected those. So we are now down to 10 mins or 1/3 of when we 
started for, to be honest, virtually no extra work, merely being 
smarter, or rather you being smarter.


3. We have looked through all our indexes and can see that every 
index has a COLLATE against it, even if the column is an integer. We 
have raised a support call with Navicat.


4. The next step is to create a "covering index" to try and get the 
whole of the query into the index. However it's 22:11 in London and I 
need to get home.


Thanks very much for the help so far. Tomorrow is more tricky but 
I'll read up on covering indexes to see how to use them.


Rob

On 17 Mar 2017, at 18:39, Simon Slavin wrote:

On 17 Mar 2017, at 6:22pm, Rob Willett 
 wrote:


4. Work through returning just the columns we actually need from 
our queries. We have a recollection that if we can build an index 
with all the information necessary in it, we can do all the work in 
joins rather than paging out to disk. Is this what you are 
referring to?


It works only where all the columns you need to read are in the same 
table.  The ideal form of a covering index is to have the columns 
listed in this order:


1) columns needed for the WHERE clause
2) columns needed for the ORDER BY clause which aren’t in (1)
3) columns needed to be read which aren’t in (2) or (1)

SQLite detects that all the information it needs for the SELECT is 
available from the index, so it doesn’t bother to read the table 
at all.  This can lead to something like a doubling of speed.  Of 
course, you sacrifice filespace, and making changes to the table 
takes a little longer.
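
As an illustration only (this column list is hypothetical, not the actual 
query from this thread), a covering index on the Disruptions table following 
that ordering might look like:

CREATE INDEX Disruptions_covering ON Disruptions (
    status,                              -- (1) used in the WHERE clause
    startTime,                           -- (2) used in the ORDER BY clause
    Disruption_id, severity, category    -- (3) columns the SELECT actually reads
);
-- e.g. SELECT Disruption_id, severity, category FROM Disruptions
--        WHERE status = 2 ORDER BY startTime;   -- answered entirely from the index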


5. Sleep (not exactly sure when) and watch three international 
rugby games tomorrow.


Sleep while waiting for indexes to be created and ANALYZE to work.  
May you see skilled players, creative moves and dramatic play.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-20 Thread Rob Willett

David,

We're English and masters of the understatement ;)

Rob

On 20 Mar 2017, at 15:04, David Raymond wrote:


"... Our query has moved from 32 mins to 54 secs.

We're quite happy with that performance increase."

I have to admit that the hearty chuckle which that statement produced 
from me hurt my sore throat. Totally worth it though.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett
We've just implemented a covering index for the last step (is it 
really?) in our quest to get the query execution time down.


To summarise we have gone from 32 mins to 16 mins by updating an index 
so it doesn't use collate, we took another six minutes off by removing 
extra fields in the select we didn't need.


We have just created a new index which 'covers' all the fields we use in 
the select, this means (and I paraphrase) that we use the index to get 
all the data and there is no need to read from the database.


Well that was a bit of a surprise, the index creation took 45 mins, we 
ran the program again and thought, rats, we've cocked it up, it only 
took 54 secs, we must have got something wrong. So we checked it and checked again 
and we hadn't got anything wrong. Our query has moved from 32 mins to 54 
secs.


We're quite happy with that performance increase. In fact we're 
delighted, so thanks for all the help in getting us to this stage.


We have kept copies of the query planner bytecode output if anybody is 
interested. Gunter has had copies, but if anybody else would like them, 
please ask.


Many thanks again for all the help,

Rob

On 17 Mar 2017, at 22:12, Rob Willett wrote:


Dear all,

We progress steadily forward.

1. We immediately halved our execution time by moving to an updated 
index that doesn't use COLLATE. That's 32 mins to 16 mins.


2. We then shaved a further six minutes off the execution time by 
removing extraneous fields in the select statement, so instead of 
"select * ...", we identified which fields we used and directly 
selected those. So we are now down to 10 mins or 1/3 of when we 
started for, to be honest, virtually no extra work, merely being 
smarter, or rather you being smarter.


3. We have looked through all our indexes and can see that every index 
has a COLLATE against it, even if the column is an integer. We have 
raised a support call with Navicat.


4. The next step is to create a "covering index" to try and get the 
whole of the query into the index. However it's 22:11 in London and I 
need to get home.


Thanks very much for the help so far. Tomorrow is more tricky but I'll 
read up on covering indexes to see how to use them.


Rob

On 17 Mar 2017, at 18:39, Simon Slavin wrote:

On 17 Mar 2017, at 6:22pm, Rob Willett  
wrote:


4. Work through returning just the columns we actually need from our 
queries. We have a recollection that if we can build an index with 
all the information necessary in it, we can do all the work in joins 
rather than paging out to disk. Is this what you are referring to?


It works only where all the columns you need to read are in the same 
table.  The ideal form of a covering index is to have the columns 
listed in this order:


1) columns needed for the WHERE clause
2) columns needed for the ORDER BY clause which aren’t in (1)
3) columns needed to be read which aren’t in (2) or (1)

SQLite detects that all the information it needs for the SELECT is 
available from the index, so it doesn’t bother to read the table at 
all.  This can lead to something like a doubling of speed.  Of 
course, you sacrifice filespace, and making changes to the table 
takes a little longer.


5. Sleep (not exactly sure when) and watch three international rugby 
games tomorrow.


Sleep while waiting for indexes to be created and ANALYZE to work.  
May you see skilled players, creative moves and dramatic play.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Dear all,

We progress steadily forward.

1. We immediately halved our execution time by moving to an updated 
index that doesn't use COLLATE. That's 32 mins to 16 mins.


2. We then shaved a further six minutes off the execution time by 
removing extraneous fields in the select statement, so instead of 
"select * ...", we identified which fields we used and directly selected 
those. So we are now down to 10 mins or 1/3 of when we started for, to 
be honest, virtually no extra work, merely being smarter, or rather you 
being smarter.


3. We have looked through all our indexes and can see that every index 
has a COLLATE against it, even if the column is an integer. We have 
raised a support call with Navicat.


4. The next step is to create a "covering index" to try and get the 
whole of the query into the index. However it's 22:11 in London and I 
need to get home.


Thanks very much for the help so far. Tomorrow is more tricky but I'll 
read up on covering indexes to see how to use them.


Rob

On 17 Mar 2017, at 18:39, Simon Slavin wrote:

On 17 Mar 2017, at 6:22pm, Rob Willett  
wrote:


4. Work through returning just the columns we actually need from our 
queries. We have a recollection that if we can build an index with 
all the information necessary in it, we can do all the work in joins 
rather than paging out to disk. Is this what you are referring to?


It works only where all the columns you need to read are in the same 
table.  The ideal form of a covering index is to have the columns 
listed in this order:


1) columns needed for the WHERE clause
2) columns needed for the ORDER BY clause which aren’t in (1)
3) columns needed to be read which aren’t in (2) or (1)

SQLite detects that all the information it needs for the SELECT is 
available from the index, so it doesn’t bother to read the table at 
all.  This can lead to something like a doubling of speed.  Of course, 
you sacrifice filespace, and making changes to the table takes a 
little longer.


5. Sleep (not exactly sure when) and watch three international rugby 
games tomorrow.


Sleep while waiting for indexes to be created and ANALYZE to work.  
May you see skilled players, creative moves and dramatic play.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Simon,

We're still benchmarking the various changes we've made during the day. 
I'm sitting here watching a tail of a log file waiting for it to finish. 
Ah the joys of the internet startup :)


Our plan of action is

1. Have baseline from which we can work from. We now have that.

2. Create the correct sub-query index and work out the (hopefully) 
increase in speed.


3. Work through all the remaining indexes and check that we have not 
made the same mistake. I know we actually have :( Keep rerunning our 
benchmark so we know if we are actually making a difference.


4. Work through returning just the columns we actually need from our 
queries. We have a recollection that if we can build an index with all 
the information necessary in it, we can do all the work in joins rather 
than paging out to disk. Is this what you are referring to?


5. Sleep (not exactly sure when) and watch three international rugby 
games tomorrow.


Rob

On 17 Mar 2017, at 18:15, Simon Slavin wrote:

On 17 Mar 2017, at 5:30pm, Rob Willett  
wrote:


echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null


twice and each run was 12 mins. So we were twice as quick, which is 
nice.


Do you actually need all columns ?  If not, then specifying the 
columns you need can lead to a further speedup.  It might be enough 
just to specify the columns you need, but you can achieve further 
increases in speed by making a covering index.  If speed for this 
SELECT is sufficiently important to you, and you don’t actually need 
all columns, post again and we’ll explain further.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Thanks to everybody for their help earlier today.

As promised, here are the results of our various tests. Hopefully they may 
be of use to somebody...


We decided to start from a known position and so recreated the original 
index with the collation in it. We know this was sub-optimal but it's our 
reference point. We have the bytecode output if anybody wants to see it.


CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE NOCASE 
ASC);


We ran the the following SQL twice

echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null


and the two runs totalled 46 mins. Each was actually 23 mins.

We then dropped the old index, built the new one

echo 'CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");' | 
sqlite3 tfl.sqlite


We ran

echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null


twice and each run was 12 mins. So we were twice as quick, which is nice.

We then ran

echo "explain select * from Disruptions where status = 2 UNION ALL 
select * from Disruptions where status = 6;" | sqlite3 tfl.sqlite


twice. Each run was around 11.5 mins. We're not going to get into 
differences of less than a minute on a run of this size, so we'll say 
they are about the same speed.


Interesting results, clearly the collation does make a big difference. 
We are now going to go through the schema and check if we have made the 
same mistake elsewhere.


Thanks for your help, we can post the bytecode if people are interested.

Rob

On 17 Mar 2017, at 11:41, Rob Willett wrote:


Gunter,

I would never presume to describe anybody as a Nerd!

We're just going back to the very first position with the 'bad' collation 
index so we can do proper timings as we change things so we understand 
the speed up (we hope there is a speed up)


We've written a quick script to check each version. Once we've put the 
original index back in, we've added a step to generate the SQLite 
bytecode for you. It's the least we can do...


We'll post this when its completed but we suspect it may take most of 
the day now :)


echo "Using Index 'CREATE INDEX "Disruptions_idx4" ON Disruptions 
("status" COLLATE NOCASE ASC);'"
echo "explain select * from Disruptions where status = 2 OR status = 
6;" | sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Creating new index without collation"
echo "drop index Disruptions_idx4;" | sqlite3 tfl.sqlite
echo 'CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");' | 
sqlite3 tfl.sqlite
echo "explain select * from Disruptions where status = 2 OR status = 
6;" | sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Trying SELECT statement with UNION ALL"
echo "explain select * from Disruptions where status = 2 OR status = 
6;" | sqlite3 tfl.sqlite

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date


On 17 Mar 2017, at 11:30, Hick Gunter wrote:

Nerds with chronic byte code affinity like myself would like to see 
the output of "explain" (without "query plan"), i.e. the SQLite 
bytecode produced. I guess the query with OR will have a subprogram 
called once for each status value, whereas I expect the query with 
UNION ALL to have 2 copies of the search (which would not affect the 
run time) and maybe even a temporary table of results (which would 
take longer and use more memory).


-Ursprüngliche Nachricht-
Von: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im Auftrag von 
Rob Willett

Gesendet: Freitag, 17. März 2017 12:19
An: SQLite mailing list 
Betreff: Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index 
question ...


Gunter, Simon,

Thanks for the replies, You both seem to be drilling into the 
collation sequence as a possible issue. We now have a new index and 
we have just run the query again


sqlite> analyze;
sqlite> drop index Disruptions_idx4;
sqlite> CREATE INDEX "Disruptions_idx4" ON Disruptions ("status&q

Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Gunter,

I would never presume to describe anybody as a Nerd!

We're just going back to the very first position with the 'bad' collation 
index so we can do proper timings as we change things so we understand 
the speed up (we hope there is a speed up)


We've written a quick script to check each version. Once we've put the 
original index back in, we've added a step to generate the SQLite 
bytecode for you. It's the least we can do...


We'll post this when its completed but we suspect it may take most of 
the day now :)


echo "Using Index 'CREATE INDEX "Disruptions_idx4" ON Disruptions 
("status" COLLATE NOCASE ASC);'"
echo "explain select * from Disruptions where status = 2 OR status = 6;" 
| sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Creating new index without collation"
echo "drop index Disruptions_idx4;" | sqlite3 tfl.sqlite
echo 'CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");' | 
sqlite3 tfl.sqlite
echo "explain select * from Disruptions where status = 2 OR status = 6;" 
| sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Trying SELECT statement with UNION ALL"
echo "explain select * from Disruptions where status = 2 OR status = 6;" 
| sqlite3 tfl.sqlite

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date


On 17 Mar 2017, at 11:30, Hick Gunter wrote:

Nerds with chronic byte code affinity like myself would like to see 
the output of "explain" (without "query plan"), i.e. the SQLite 
bytecode produced. I guess the query with OR will have a subprogram 
called once for each status value, whereas I expect the query with 
UNION ALL to have 2 copies of the search (which would not affect the 
run time) and maybe even a temporary table of results (which would 
take longer and use more memory).


-Ursprüngliche Nachricht-
Von: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im Auftrag von 
Rob Willett

Gesendet: Freitag, 17. März 2017 12:19
An: SQLite mailing list 
Betreff: Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index 
question ...


Gunter, Simon,

Thanks for the replies, You both seem to be drilling into the 
collation sequence as a possible issue. We now have a new index and we 
have just run the query again


sqlite> analyze;
sqlite> drop index Disruptions_idx4;
sqlite> CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");
sqlite> explain query plan select * from Disruptions where status = 2 
OR

status = 6;
selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

So we have a different response from the query planner, which I think 
is good.


If we use your other example

sqlite> explain query plan select * from Disruptions  where status = 2
UNION ALL select * from Disruptions where status = 6;
selectid|order|from|detail
1|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
2|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)

I'm not sure which query is going to be faster. We'll have to try and 
see.


Your last suggestion of "select * from Disruptions  where status =2 
COLLATE NOCASE or status = 6 COLLATE NOCASE" appears to be logically 
equivalent to "explain query plan select * from Disruptions where 
status = 2 OR status = 6;" now we have removed the collation from the 
index.


sqlite> explain query plan select * from Disruptions  where status =2
COLLATE NOCASE or status = 6 COLLATE NOCASE;
selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

I'll check if we require all the fields, we require many (which I 
agree is not all) of the fields. Following this logic through, does 
this mean that it will do more file access bringing the records in 
from the file system?


The collation issue seems to be an artifact of the way Navicat for 
SQLite works. I suspect we need to be more careful about how we use the tool.

Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Gunter, Simon,

Thanks for the replies, You both seem to be drilling into the collation 
sequence as a possible issue. We now have a new index and we have just 
run the query again


sqlite> analyze;
sqlite> drop index Disruptions_idx4;
sqlite> CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");
sqlite> explain query plan select * from Disruptions where status = 2 OR 
status = 6;

selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

So we have a different response from the query planner, which I think is 
good.


If we use your other example

sqlite> explain query plan select * from Disruptions  where status = 2 
UNION ALL select * from Disruptions where status = 6;

selectid|order|from|detail
1|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
2|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)

I'm not sure which query is going to be faster. We'll have to try and 
see.


Your last suggestion of "select * from Disruptions  where status =2 
COLLATE NOCASE or status = 6 COLLATE NOCASE" appears to be logically 
equivalent to "explain query plan select * from Disruptions where status 
= 2 OR status = 6;" now we have removed the collation from the index.


sqlite> explain query plan select * from Disruptions  where status =2 
COLLATE NOCASE or status = 6 COLLATE NOCASE;

selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

I'll check if we require all the fields, we require many (which I agree 
is not all) of the fields. Following this logic through, does this mean 
that it will do more file access bringing the records in from the file 
system?


The collation issue seems to be an artifact of the way Navicat for SQLite 
works. I suspect we need to be more careful about how we use the tool.


We'll now time the results of each query and run them twice to see the 
effect. No idea how long this will take but suspect a few hours :) I 
will post back the results as other people may (or may not) find this 
helpful.


Thanks

Rob

On 17 Mar 2017, at 10:57, Hick Gunter wrote:

On 17 Mar 2017, at 10:20am, Rob Willett 
 wrote:


CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE 
NOCASE ASC);


[…]

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;


The schema for the table says that "status" is INTEGER.
You are supplying numbers as arguments.
Those two match and should create no problem.

But your index has a collation order which is usually used for text.  
I don’t see that it is obviously wrong, but it does look a little 
weird.


Try creating another index which is just on "status", without the 
COLLATE clause.

Then do another ANALYZE, then try the SELECT again.

Simon.


If the index is deemed unsuitable by SQLite due to its collation 
sequence, then I expect it would also be ignored in "select ... 
status=1" (without the second ORed value)


If not, then (select ... where status =2 UNION ALL select where status 
= 6) should do the trick


Do you really require all the fields from Disruptions?

And yes, collating integers with NOCASE seems quite strange (there are 
no capital or lowercase numbers unless you are using roman numerals ;) 
); for text affinity, it should render the comparison operators 
caseblind, just like "like".



___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use 
of the intended recipient(s) only and may contain information that is 
confidential, privileged or legally protected. Any unauthorized use or 
dissemination of this communication is strictly prohibited. If you 
have received this communication in error, please immediately notify 
the sender by return e-mail message and delete all copies of the 
original communication. Thank you for your cooperation.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Simon,

Thank you for the quick response. We'll do this. We're just waiting for 
another analyze to finish. Our current DB is 45GB, the query works OK on 
smaller databases, but at this size it's very slow, that's why we have a 
bad feeling that we're tinkering on the edges and we need a new DB 
design :(


The reason for the collation is the way Navicat for SQLite works; it's 
difficult to create the index without the collation, but we'll do this 
directly from the command line.


We'll post back in an hour or so as it'll probably take that long to run 
:)


Rob

On 17 Mar 2017, at 10:27, Simon Slavin wrote:

On 17 Mar 2017, at 10:20am, Rob Willett  
wrote:


CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE 
NOCASE ASC);


[…]

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;


The schema for the table says that "status" is INTEGER.
You are supplying numbers as arguments.
Those two match and should create no problem.

But your index has a collation order which is usually used for text.  
I don’t see that it is obviously wrong, but it does look a little 
weird.


Try creating another index which is just on "status", without the 
COLLATE clause.

Then do another ANALYZE, then try the SELECT again.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Hi,

We've been struggling with a complex query that we have written. One of 
the elements of this complex query is a select statement that doesn't 
appear to use an index when we *think* it should do. We're not going to 
ask you to debug a large complex SQL query (unless you have nothing else 
to do today) but we're working our way through the query trying to 
understand where we've gone wrong. So we've broken down the query and 
are trying each section to see how each part performs (or rather, doesn't 
perform).


The sub query is a simple select on a large table, Disruptions. The 
Disruptions table has 180M rows of data.


The schema for it is here. We've pulled it straight from Navicat for 
SQLite.


```
CREATE TABLE "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", 
'184626834');


-- 
--  Indexes structure for table Disruptions
-- 
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC, 
"category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" COLLATE 
NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE 
NOCASE ASC);
CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE NOCASE 
ASC);


PRAGMA foreign_keys = true;
```

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;
```

Status is one of six values, 1 to 6 and is probably not evenly 
distributed across the 180M rows.


If we do

```
sqlite> explain query plan select * from Disruptions where status = 2 OR 
status = 6;

selectid|order|from|detail
0|0|0|SCAN TABLE Disruptions
```

We can see that table scanning 180M records is going to be slow, no 
matter what the rest of the complex query is like.


We have an index Disruptions_idx4 which we *think* should speed it up, 
but the query plan doesn't seem to take this into account.


We think that only having six values of Status means that the speed up 
from the index is not going to be fantastic but every little helps.


We have run analyze on the database and that hasn't helped.

Our gut feeling at this moment is that we have the design structure 
wrong in our DB and we're going to have to take a long hard look at what 
we're doing, do a redesign and a rebuild as we simply got it wrong from 
the beginning. Hindsight is wonderful ;) In the interim (as this is a 
2-3 month job), we need to speed our query up from 90 mins down to 
something in the tens of mins.


Any suggestions very much welcomed,

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use parameterized queries in SQLite.Net

2017-03-13 Thread Rob Richardson
To answer my own question:  this works:

using (SQLiteCommand command = m_conn.CreateCommand())
{
command.CommandType = CommandType.Text;
command.CommandText = "INSERT INTO trend_data (tag_key, value, 
value_timestamp) VALUES (?, ?, ?)";
SQLiteParameter param;
param = new SQLiteParameter();
param.Value = 2;
command.Parameters.Add(param);
param = new SQLiteParameter();
param.Value = 234.56;
command.Parameters.Add(param);
param = new SQLiteParameter();
param.Value = DateTime.Now;
command.Parameters.Add(param);
rowsAffected = command.ExecuteNonQuery();
}
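
For completeness, SQLite itself accepts both unnamed and named parameters in 
the SQL text (a small sketch re-using the trend_data insert above; the 
parameter names are illustrative):

INSERT INTO trend_data (tag_key, value, value_timestamp) VALUES (?, ?, ?);          -- unnamed, bound by position
INSERT INTO trend_data (tag_key, value, value_timestamp) VALUES (@key, @val, @ts);  -- named, "@" style
INSERT INTO trend_data (tag_key, value, value_timestamp) VALUES (:key, :val, :ts);  -- named, ":" style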

RobR

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Rob Richardson
Sent: Monday, March 13, 2017 2:23 PM
To: General Discussion of SQLite Database (sqlite-users@mailinglists.sqlite.org)
Subject: [sqlite] How to use parameterized queries in SQLite.Net

Hello again.

Since my attempt to find the official answer for myself has hit a snag, I'll 
just ask here.

The examples I've seen for parameterized queries used with the SQLiteCommand 
class have shown named parameters, and the names usually begin with an "@" 
character.  Is that character required for named parameters?  Is that the 
correct leading character?  Is it required to include that leading character in 
the name given to the SQLiteParameter object?

I'm used to using the System.Data.ODBC classes, which do not support named 
parameters, but they do support unnamed parameters, represented by question 
marks.  The order in which the parameters are attached to the command object 
determines the association between the parameter object and the query 
parameter.  Unnamed parameters would be easier for me to work with than named 
ones.  Does SQlite.Net support unnamed parameters?

Thank you.

RobR


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Help file has no information

2017-03-13 Thread Rob Richardson
Thank you.  That worked.  (The button was labelled "unblock", not "unlock")

RobR

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Graham Holden
Sent: Monday, March 13, 2017 2:26 PM
To: SQLite mailing list
Subject: Re: [sqlite] Help file has no information

Either don't store it on a network drive or right click, select properties and 
unlock (see 
https://social.technet.microsoft.com/Forums/en-US/20700886-2000-4c52-b706-aa1fb32d3dfb/cant-view-chm-file-contents?forum=W8ITProPreRel).
 I think.
Graham

Sent from my Samsung Galaxy S7 - powered by Three
 Original message From: Rob Richardson 
 Date: 13/03/2017  18:14  (GMT+00:00) To: "General 
Discussion of SQLite Database (sqlite-users@mailinglists.sqlite.org)" 
 Subject: [sqlite] Help file has no 
information The help file that is linked from the System.Data.SQLite home page 
appears to have a table of contents but no information.  No matter what page I 
select, the page does not appear.

RobR
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How to use parameterized queries in SQLite.Net

2017-03-13 Thread Rob Richardson
Hello again.

Since my attempt to find the official answer for myself has hit a snag, I'll 
just ask here.

The examples I've seen for parameterized queries used with the SQLiteCommand 
class have shown named parameters, and the names usually begin with an "@" 
character.  Is that character required for named parameters?  Is that the 
correct leading character?  Is it required to include that leading character in 
the name given to the SQLiteParameter object?

I'm used to using the System.Data.ODBC classes, which do not support named 
parameters, but they do support unnamed parameters, represented by question 
marks.  The order in which the parameters are attached to the command object 
determines the association between the parameter object and the query 
parameter.  Unnamed parameters would be easier for me to work with than named 
ones.  Does SQlite.Net support unnamed parameters?

Thank you.

RobR


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Help file has no information

2017-03-13 Thread Rob Richardson
The help file that is linked from the System.Data.SQLite home page appears to 
have a table of contents but no information.  No matter what page I select, the 
page does not appear.

RobR
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Where is the official document for System.Data.SQLite?

2017-03-13 Thread Rob Richardson
Blindness cured.  I had visited that page but missed the super-sized, bold 
headline that said "Documentation for System.Data.SQLite".  

Thanks for opening my eyes.

RobR

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Donald Griggs
Sent: Monday, March 13, 2017 1:57 PM
To: SQLite mailing list
Subject: Re: [sqlite] Where is the official document for System.Data.SQLite?

On Mon, Mar 13, 2017 at 1:44 PM, Rob Richardson 
 wrote:

> Where is the official document for System.Data.SQLite?  And is there a 
> better list than this one to ask questions about System.Data.SQlite?


Hi Rob,

I don't use system.data.sqlite, but I believe you'll find the documentation
at:

http://system.data.sqlite.org

>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Where is the official document for System.Data.SQLite?

2017-03-13 Thread Rob Richardson
Where is the official document for System.Data.SQLite?  And is there a better 
list than this one to ask questions about System.Data.SQlite?



Specifically, I'm looking for the correct way to use the SQLiteCommand class 
and parameterized queries.  Do parameters have to be named?



Thank you.



RobR

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why isn't my time formatting working?

2017-03-08 Thread Rob Richardson
Thank you.

The https://www.sqlite.org/lang_datefunc.html page doesn't seem to make it 
clear what strftime() returns.  The specification it gives for strftime() is:

strftime(format, timestring, modifier, modifier, ...)

Given the lack of an indication of the return type, it seemed to me to be 
reasonable to assume that since I'm passing in a string as one of the 
arguments, I'd get a datetime object out.  It did not seem reasonable to me to 
merely pass in a string of a forced format, the ISO standard format Dr. Hipp 
mentioned, to get a string in some other format.  

But I've been burned before by data types, or lack thereof, in SQLite.  I 
usually work in C# and PostgreSQL, where variables and data columns always have 
definite data types, and, if I remember correctly (it's been a couple of years 
since I worked with SQLite), SQLite does things differently.  I know there's 
nothing stopping me from putting any value into a field, regardless of the type 
of data other records have for that field.

RobR

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Donald Griggs
Sent: Wednesday, March 08, 2017 2:40 PM
To: SQLite mailing list
Subject: Re: [sqlite] Why isn't my time formatting working?

Hi Rob,

The format string of '%d/%m/%Y %H:%M:%S' describes what you want as output, not 
what you're supplying as input.

You can use substr() and concatenation || to mash up your original string into 
the ISO format (which is much easier to handle anyway.) 
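
A minimal sketch of that mash-up for the string from the original post 
(assuming it is MM/DD/YYYY HH:MM:SS):

SELECT strftime('%d/%m/%Y %H:%M:%S',
                substr(ts, 7, 4) || '-' || substr(ts, 1, 2) || '-' ||
                substr(ts, 4, 2) || ' ' || substr(ts, 12, 8))
  FROM (SELECT '03/07/2017 13:06:03' AS ts);
-- the inner expression yields '2017-03-07 13:06:03', which strftime() accepts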
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why isn't my time formatting working?

2017-03-08 Thread Rob Richardson
But the strftime() function is supposed to work with whatever format I give it, 
isn't it?  According to the documentation, %d is a two-digit day, %m is a 
two-digit month, and so on.   Is there truly no way to convert my original 
string into a datetime object?

RobR

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Richard Hipp
Sent: Wednesday, March 08, 2017 2:28 PM
To: SQLite mailing list
Subject: Re: [sqlite] Why isn't my time formatting working?

On 3/8/17, Rob Richardson  wrote:
> Hello!
>
> I have a table with times stored as strings.  I massaged them into a 
> form that strftime() should be able to work with, but it's not 
> working.  Here's a little query using the string as it is currently formatted:
>
> select strftime('%d/%m/%Y %H:%M:%S', '03/07/2017 13:06:03')

SQLite uses ISO-8601 dates:  YYYY-MM-DD
--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Why isn't my time formatting working?

2017-03-08 Thread Rob Richardson
Hello!

I have a table with times stored as strings.  I massaged them into a form that 
strftime() should be able to work with, but it's not working.  Here's a little 
query using the string as it is currently formatted:

select strftime('%d/%m/%Y %H:%M:%S', '03/07/2017 13:06:03')

This query returns null.  Why?  

(Hmmm...  I see that it should return July 3rd, 2017 when I want March 7th, but 
that doesn't explain why it doesn't give me anything.)

Thanks for your help.

RobR
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

