Re: [sqlite] Size limits

2020-01-07 Thread Rob Willett

Andy,

I can state that SQLite easily went up to 100GB with zero issues for us. 
We decided to change the structure of our database and we reduced our 
database size as we didn't want to incur extra VPN costs. We also 
realised we had designed the database wrong and it had a lot of useless 
data in it... but that's another story :)


I have heard people talking about 300GB SQLite databases and I have no 
reason to doubt that SQLite would work. As Simon says, environmental 
issues may come into effect before you hit SQLite's limits.


I can't comment on large blobs as I'm not sure what 'large' is: 1MB, 100MB, 
1GB?


Rob

On 7 Jan 2020, at 11:21, Simon Slavin wrote:


On 7 Jan 2020, at 10:29am, Andy  wrote:

What are reasonable size limits for an SQLite3 database file and for large 
blobs?


Unfortunately, recommendations (rather than hard limits) are closely 
tied to your OS and hardware.  This is because the things you want to 
avoid are things like busting your cache, or memory faults, and the 
sizes of those are set by your own computer and OS.


Experience of this list has shown that we really can't make any firm 
limits without knowledge of your setup, and you'd have to find someone 
with an identical setup to have a useful conversation.  We can only 
urge you to test things out and see for yourself where your 
performance drops off.


In terms of hard limits, you might like to read this:




Re: [sqlite] How to increase performance when inserting a lot of small data into table using indices

2019-09-10 Thread Rob Willett

Hartwig,

You have got most of the tricks we know about. Other more experienced 
developers may provide a better insight.


We had to move about 60GB of table data around and we ended up doing 
what you have done, with one extra bit: we batched the inserts up in 
groups of 10,000 between BEGIN and END to make transactions out of 
them. It's not clear if you are doing that.
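
A minimal sketch of that batching (the table and column names here are illustrative, not from the original post):

```
-- One transaction per batch of roughly 10,000 rows instead of one per INSERT.
BEGIN;
INSERT INTO readings (ts, payload) VALUES (1, 'a');
INSERT INTO readings (ts, payload) VALUES (2, 'b');
-- ... continue up to ~10,000 INSERTs ...
COMMIT;
-- then BEGIN the next batch of 10,000
```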


Rob

On 10 Sep 2019, at 16:02, mailing lists wrote:


I have the following situation:

- in one table, relatively small records are inserted (100 bytes per record)
- this table contains three indices
- about 100 million or more records have to be inserted

Insertion really slows down after about 100,000 items have been 
inserted. I suppose that the slowdown is related to indexing because:


a) removing the indices brings the speed up
b) it does not matter whether I use a solid-state drive or a 
conventional one (the overall speed differs but not the phenomenon 
itself)

c) changing the cache size has only a minor impact

So, the best solution I have found so far is to disable indexing during 
insertion and to index the table afterwards (this is orders of magnitude 
faster than inserting with the indexes in place). Are there any better 
solutions or other tricks I might try (splitting the table into a data 
part and an index part)?
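
A hedged sketch of that approach (the table and index names are made up for illustration):

```
-- Load with the secondary indexes absent, then build them once at the end.
DROP INDEX IF EXISTS items_idx1;
DROP INDEX IF EXISTS items_idx2;
DROP INDEX IF EXISTS items_idx3;
-- ... run the batched INSERTs into items here ...
CREATE INDEX items_idx1 ON items (col_a);
CREATE INDEX items_idx2 ON items (col_b);
CREATE INDEX items_idx3 ON items (col_c);
```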


BTW: I am using journal_mode DELETE. WAL mode only delays the problem 
and increases the speed a bit, but not significantly.
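
For reference, the settings being discussed look like this (the values are illustrative, not a recommendation):

```
PRAGMA journal_mode = WAL;     -- the post currently uses DELETE
PRAGMA cache_size = -1000000;  -- a negative value is a size in KiB (about 1 GB here)
```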


Regards,
Hartwig





Re: [sqlite] How to create a new table or overwrite an existing table in one command?

2019-09-04 Thread Rob Willett

Ingo,

I can't answer that as we have everything in one file. I suspect that, in 
hindsight, putting a large table in a separate file would have been 
advantageous. However, the one-off cost of dropping a 59GB table has gone 
and our daily pruning and vacuuming of the table takes a few seconds. 
Hindsight is great :)


I brought this up as it was a major issue for us at the time and we 
wanted other people to be aware that deleting a table in SQLite is not 
as 'cheap' as in other systems. This is the first time we have found SQLite 
to not be as good as anything else :) Please note that this is not meant 
as criticism of SQLite but rather one of the tradeoffs we know about 
and make. We win on so many other things that we have no 
issues.


Rob

On 4 Sep 2019, at 12:02, ingo wrote:


On 4-9-2019 12:24, Rob Willett wrote:

Peng,

Dropping very large tables is time consuming. Dropping a 59GB table
takes quite a long time for us even on fast hardware. Dropping 
smaller

tables is faster though.



When using (and dropping) these big tables, would it be an advantage to
put only that one table in a separate database and attach it when
needed? There would then be no need to drop it; one could just detach
and delete the db.
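
A minimal sketch of that idea (the file, schema and table names are illustrative):

```
-- Keep the big table in its own database file and attach it on demand.
ATTACH DATABASE 'bigtable.db' AS big;
SELECT count(*) FROM big.events;   -- query it as usual via the schema prefix
DETACH DATABASE big;
-- Instead of DROP TABLE, delete bigtable.db at the filesystem level
-- (while no connection has it attached).
```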

Ingo


Re: [sqlite] How to create a new table or overwrite an existing table in one command?

2019-09-04 Thread Rob Willett

Peng,

Dropping very large tables is time consuming. Dropping a 59GB table 
takes quite a long time for us even on fast hardware. Dropping smaller 
tables is faster though.


Not sure what size tables you have, but it is something to think about. We 
experimented with new tables and with changing old tables and, for us, 
dropping the table and creating a new one was the fastest method, but it 
still took 40-60 mins from memory.


Rob

On 4 Sep 2019, at 11:18, Peng Yu wrote:

For now, I just delete the db file if it exists already, so that I don't 
need to call DROP TABLE IF EXISTS mytab. I guess directly deleting the db 
file can be faster than the latter. Is it so?

On Tue, Sep 3, 2019 at 3:06 PM Simon Slavin wrote:



On 3 Sep 2019, at 8:57pm, Peng Yu  wrote:

If I try to create a table that already exists, sqlite will give me an 
error. Is there a way to issue one command to create a table, but if 
there is already a table with the same name, drop it and then create the 
new table?

Thanks.

Assuming that the new table has a different structure to the old one, do 
it in two commands:

DROP TABLE IF EXISTS MyTable;
CREATE TABLE MyTable ...;

Neither of those commands will generate an error.

If you are sure that the new table has the same structure as the old one, 
you can do

CREATE TABLE IF NOT EXISTS MyTable ...;
DELETE FROM MyTable;


--
Regards,
Peng


Re: [sqlite] Tracking item history using SQLite

2019-09-02 Thread Rob Willett

Hi,

We aren't storing first_seen in every row. Each incident is something 
like


Date_Time_of_update_in_epoch_secs1 unique_incident_number <more information about the incident>
Date_Time_of_update_in_epoch_secs2 unique_incident_number <more information about the incident>
Date_Time_of_update_in_epoch_secs3 unique_incident_number <more information about the incident>
Date_Time_of_update_in_epoch_secs4 unique_incident_number <more information about the incident>
Date_Time_of_update_in_epoch_secs5 unique_incident_number <more information about the incident>


These incidents are spread over time, but normally they are around 
300-600 secs apart. There is no guarantee that each update has an entry 
for that unique_incident_number in it, so we could have an update with an 
incident missing, and then one 10 mins later with the incident back in. The 
data feed is wholly out of our control and it may be that a traffic 
officer has not been able to add things in. We have to manage the data being 
incomplete, but we've learnt how to now.


What we do each night is work out the stuff in the middle of the first 
and last incident (based on the unique_incident_number) and throw the 
middle stuff away. There are other heuristics that determine what we 
throw away, so we built a small state machine to work it all out. The 
first time we ran the system, it took about three hours to run and work 
out what to throw away. That's when we found out that dropping a table in 
SQLite is quite time expensive. However, once you do the first run, 
subsequent runs take around 90 secs. This means that the report we run 
to work out the averages of incidents dropped from 90 mins to around 90 
secs.
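
A hedged sketch of that nightly pruning in SQL (the table and column names are illustrative; the real job also applies extra heuristics via a Perl state machine):

```
-- Keep only the earliest and latest update per incident, judged by the
-- insertion timestamp, and throw away the rows in between.
DELETE FROM incident_updates
 WHERE epoch_secs > (SELECT MIN(epoch_secs) FROM incident_updates AS u
                      WHERE u.unique_incident_number =
                            incident_updates.unique_incident_number)
   AND epoch_secs < (SELECT MAX(epoch_secs) FROM incident_updates AS u
                      WHERE u.unique_incident_number =
                            incident_updates.unique_incident_number);
```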


Most of our stuff is written in the world's most hated language, Perl, 
which we find easy to use and fast. No idea why people don't like it ;)


We can't do this in real time as it's too much work, so we run it at 
03:35 each day on a cronjob.


Hope this helps.

Rob

On 31 Aug 2019, at 14:45, Rob Richardson wrote:

Why are you storing first_seen in every record?  To avoid searching for it 
when reports are generated?

On Sat, Aug 31, 2019 at 6:24 AM Rob Willett wrote:


Hi,

We have a very similar system that captures traffic incident 
information

such as accidents, roadworks, traffic jams and sends personalised
information to drivers based on their routes.

We realised after a few years that our original design for the 
database
was inefficient as we had a single table that was about 60GB, circa 
500M
rows, that consisted of traffic incidents and uptakes. The 
inefficiency

was our design and not SQLite.

This inefficiency showed when we used to run a daily report that
summarised the traffic information for historical trends, e.g. we 
wanted
to know the average time for a road accident to be cleared on a 
stretch

of road on a specific day or time of day. This report meant we read in
circa 500M rows and 60GB of data each night and took circa 90 mins to
run as we did a lot of calculations.

We were not able to change the structure of the database too much (as 
it
was too far down the line). So we spent a long time looking at the 
data

with cold, wet towels on our foreheads in darkened rooms :) After we
analysed the data we realised that most of our 'updates' were not
actually updates at all, and all we were actually interested in was the
start and end times of incidents; all the 'stuff' in the middle was just
noise.  The problem we had was that disruptions could last a long 
time
and there was no actual end of incident marker, just that there 
wasn't

another disruption with that unique identifier and a later timestamp.
This sounds similar to your situation. Basically we constructed a 
simple

state model based on time.

What we now do is run a Perl script each night to
remove anything that is between the first and last incidents. We know
that the end incident may move on the next day, but at that point in
time it is still the last/end/most recent indicator. Our data is
structured around time so we always have a time of insertion 
indicator

in the field as things may not be ordered by row id.

Once we had this insight, we took our database down from 60GB to 
800MB
and the report that ran from 90 mins to 90 secs. We have to give 
credit
to this mailing list for the patient and courteous way that helped 
and

assisted with our often stupid questions.

The lessons we learnt from this are:

1. Get your database schema right from the go. We didn't. In 
hindsight

we should have spent longer looking at this.

2. Put in more information than you think you may need just in case. 
We

did that. We put time information in at per second granularity.

3. Don't worry too much about disk space unless you are on a 
constrained

device. In hindsight we could have stored even more data :)

4. Think hard about solutions that are simple. We love simplicity as
this makes it easier to go back and change things. Too many 
dependencies
would be too difficult to 

Re: [sqlite] Tracking item history using SQLite

2019-08-31 Thread Rob Willett

Hi,

We have a very similar system that captures traffic incident information 
such as accidents, roadworks, traffic jams and sends personalised 
information to drivers based on their routes.


We realised after a few years that our original design for the database 
was inefficient, as we had a single table, about 60GB and circa 500M 
rows, that consisted of traffic incidents and updates. The inefficiency 
was our design and not SQLite.


This inefficiency showed when we used to run a daily report that 
summarised the traffic information for historical trends, e.g. we wanted 
to know the average time for a road accident to be cleared on a stretch 
of road on a specific day or time of day. This report meant we read in 
circa 500M rows and 60GB of data each night, and it took circa 90 mins to 
run as we did a lot of calculations.


We were not able to change the structure of the database too much (as it 
was too far down the line). So we spent a long time looking at the data 
with cold, wet towels on our foreheads in darkened rooms :) After we 
analysed the data we realised that most of our 'updates' were not 
actually updates at all, and all we were actually interested in was the 
start and end times of incidents; all the 'stuff' in the middle was just 
noise. The problem we had was that disruptions could last a long time 
and there was no actual end-of-incident marker, just that there wasn't 
another disruption with that unique identifier and a later timestamp. 
This sounds similar to your situation. Basically we constructed a simple 
state model based on time.


What we now do is run a Perl script each night to remove anything that 
is between the first and last incidents. We know that the end incident 
may move on the next day, but at that point in time it is still the 
last/end/most recent indicator. Our data is structured around time, so we 
always have a time-of-insertion indicator in the field, as things may not 
be ordered by row id.


Once we had this insight, we took our database down from 60GB to 800MB 
and the report from 90 mins to 90 secs. We have to give credit to this 
mailing list for the patient and courteous way it helped and assisted 
with our often stupid questions.


The lessons we learnt from this are:

1. Get your database schema right from the get-go. We didn't. In hindsight 
we should have spent longer looking at this.


2. Put in more information than you think you may need just in case. We 
did that. We put time information in at per second granularity.


3. Don't worry too much about disk space unless you are on a constrained 
device. In hindsight we could have stored even more data :)


4. Think hard about solutions that are simple. We love simplicity as 
this makes it easier to go back and change things. Too many dependencies 
would be too difficult to unpick. Keeping it simple also works for us as 
we're not experts :)


5. SQLite seems to be able to do anything we want it to. We know the 
'limitations' of it not being a full client/server database; just to be 
clear, we do not consider this a limitation at all but rather a virtue. 
Other people seem worried about the 'lack' of some datatypes; we do 
masses of data and date conversions as needed and it's never been a 
speed issue or any issue.


6. Ask this group for help. We realised that the more information we 
provided in a clear and concise manner in our help email, the more help 
we got; it was a virtuous circle. The sum help of the people in this 
group far, far exceeds our own knowledge. As a rule of thumb, if we ask 
for help, we expect it to take 1-2 hours as a minimum for us to write 
the email. That may be because we're not SQL experts. Also, follow up 
every email: if people have taken the time to reply to us, they 
deserve a reply themselves. As we are English, we could end up in a 
thanks-for-the-reply endless loop :) The group has an exceptionally 
high signal-to-noise ratio and is invariably courteous.


I would be astonished if you can't get SQLite to do what you want to do. 
We have never managed to touch the sides of the system and suspect we 
never will.


More than happy to answer more questions as this group helped us and 
it's only fair we offer the help back. I will state that we are not SQL 
(or SQLite) experts :)


Rob

On 30 Aug 2019, at 21:44, Random Coder wrote:


First off, if this sort of "code review" style question is
inappropriate for this list, please feel free to reply to me directly
and tell me to stop, and I'll know to not do this again.

That said, I have a question on the use of SQLite.  At the risk of
falling into the XY problem, I'll give some details on what I'm doing,
and then how I'm doing it.  My basic question is whether what I'm doing is
valid, and if I'm doing anything needlessly wasteful.

I have a system monitoring events (and I know I'm being vague on the
exact nature of these events, I can't provide too many details, sorry.
Please try to accept what I say 

Re: [sqlite] Why no persistent user settings in a database file?

2019-04-03 Thread Rob Willett

Tom,

Because the file is the database, your preferences for how things are 
displayed are just that, your preferences. Don't combine the two. If you 
have two users with different preferences, who wins?


We move the SQLite database around all the time, across different 
servers. We do not want the preferences for one to affect another 
server.


Rob

On 3 Apr 2019, at 11:52, Tom Browder wrote:


After coming back to SQLite from a long absence, I was surprised that
setting things like .mode and .headers in a database didn't stay that way
after exiting the file.

Then I remembered something about a resource file and found .sqliterc on an
internet search and that allowed the persistent settings I wanted.

However, would it not be more natural to keep those settings persistent
inside the db file?

Thanks,

-Tom


Re: [sqlite] Can I get help with db design for SQLite use?

2019-04-03 Thread Rob Willett

Shawn,

I will investigate File::Slurper. Rather than pollute this list, send me 
a mail on rob.sql...@robertwillett.com about the issues you had.

Thanks

Rob

On 3 Apr 2019, at 8:50, Shawn Wagner wrote:


Yay Perl! My favorite language. DBD::SQLite is definitely one of the
better, fuller featured sqlite bindings out there. Though Tom is using
perl6, not perl5. I have no idea how its version compares.

(At the risk of going off topic, File::Slurp has issues. File::Slurper 
is a

better alternative.)

On Wed, Apr 3, 2019, 12:08 AM Rob Willett 


wrote:


Tom,

We use the Perl DB::SQLite module. It works very well and I cannot
recall a single issue with it in the last four years. There's not as
much support for Perl on this mailing list as it's not as popular, 
but

most issues you will probably encounter will be at the design level
rather than at the CPAN module level. Our working assumption is that 
if
the DBI module looks like it has an issue, it doesn't and it's our 
code.

Pretty much the same as for SQLite. Whilst I know that SQLite has had
bugs, the chances of us finding them is minimal, so we assume it's 
our

code again.

However I can say that we use Perl for all our system code and use
SQLite within it and its fast and easy.

If you're using Perl, I would also recommend the following other
modules, this is a direct pull from our code. I've removed our code
specific modules.

```
#!/usr/bin/perl -w

use strict;
use warnings;

use Switch;
use DBI;
use JSON;
use Getopt::Long;  <-- Easiest way to get command line args in 
and

processed.
use Data::Dumper;  <-- Utterly essential, don't leave home 
without

it.
use Mojolicious::Lite; <-- Only need if you making a REST based 
server
use Mojo::Parameters;  <-- Only need if you making a REST based 
server
use Mojo::URL; <-- Only need if you making a REST based 
server
use Mojo::Log; <-- Only need if you making a REST based 
server

use REST::Client; <-- More rest based stuff
use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);
use Sereal::Decoder qw(decode_sereal sereal_decode_with_object
scalar_looks_like_sereal);
use DateTime;  <-- Manage date and time properly.
use Net::Curl::Easy;
use Crypt::Random qw( makerandom );
use Log::Log4perl qw(get_logger :levels);
use File::Path qw(make_path); <-- Quick and easy way to make paths 
and

directories.
use Net::Address::IP::Local;  <-- Easy way to manipulate IP 
addresses.

use File::Slurp; <-- Quick and easy way to read and write files.
use Clone 'clone'; <-- You'll use a lot of objects and structures. 
Copy

them properly and fast.

```

We use Log4Perl a lot as it's easy to get things setup and then you 
can

modify one log file and get easy changes. Also we use Mojolicious for
all the REST code wrapping.

These are tried and tested CPAN modules that we know just work for us
and are a standard part of every new build. The most awkward one if 
the

curl one, there seems to be a lot of versions of the Curl::Easy stuff
which simply don't work on Mac OS X (or other OS's).

Just my 2p worth.

Rob

On 3 Apr 2019, at 0:21, Tom Browder wrote:


On Tue, Apr 2, 2019 at 17:30 am...@juno.com  wrote:


You might want to import everything into SQLite Studio or SQLite
Suite I
forget the exact name)--a freebie on the internet. I found it 
worked

for
me. In order to help others, I would appreciate it if you tell us 
on

this
usergroup how you made out. May it work for you. Peace! Alex



Thanks for the idea, Alex.

I'm doing all programmatically at the moment (using a Perl 6 module:
DB::SQLite), but using one of those tools you mentioned would help 
in

design for sure!

-Tom




Re: [sqlite] Can I get help with db design for SQLite use?

2019-04-03 Thread Rob Willett

Tom,

We use the Perl DB::SQLite module. It works very well and I cannot 
recall a single issue with it in the last four years. There's not as 
much support for Perl on this mailing list as it's not as popular, but 
most issues you will probably encounter will be at the design level 
rather than at the CPAN module level. Our working assumption is that if 
the DBI module looks like it has an issue, it doesn't, and it's our code. 
Pretty much the same goes for SQLite: whilst I know that SQLite has had 
bugs, the chances of us finding them are minimal, so we assume it's our 
code again.


However I can say that we use Perl for all our system code and use 
SQLite within it, and it's fast and easy.


If you're using Perl, I would also recommend the following other 
modules, this is a direct pull from our code. I've removed our code 
specific modules.


```
#!/usr/bin/perl -w

use strict;
use warnings;

use Switch;
use DBI;
use JSON;
use Getopt::Long;             # Easiest way to get command line args in and processed.
use Data::Dumper;             # Utterly essential, don't leave home without it.
use Mojolicious::Lite;        # Only needed if you are making a REST based server
use Mojo::Parameters;         # Only needed if you are making a REST based server
use Mojo::URL;                # Only needed if you are making a REST based server
use Mojo::Log;                # Only needed if you are making a REST based server
use REST::Client;             # More REST based stuff
use Sereal::Encoder qw(encode_sereal sereal_encode_with_object);
use Sereal::Decoder qw(decode_sereal sereal_decode_with_object scalar_looks_like_sereal);
use DateTime;                 # Manage date and time properly.
use Net::Curl::Easy;
use Crypt::Random qw( makerandom );
use Log::Log4perl qw(get_logger :levels);
use File::Path qw(make_path); # Quick and easy way to make paths and directories.
use Net::Address::IP::Local;  # Easy way to manipulate IP addresses.
use File::Slurp;              # Quick and easy way to read and write files.
use Clone 'clone';            # You'll use a lot of objects and structures. Copy them properly and fast.
```

We use Log4Perl a lot as it's easy to get things set up and then you can 
modify one log file and get easy changes. Also we use Mojolicious for 
all the REST code wrapping.


These are tried and tested CPAN modules that we know just work for us 
and are a standard part of every new build. The most awkward one is the 
curl one; there seem to be a lot of versions of the Curl::Easy stuff 
which simply don't work on Mac OS X (or other OSes).


Just my 2p worth.

Rob

On 3 Apr 2019, at 0:21, Tom Browder wrote:


On Tue, Apr 2, 2019 at 17:30 am...@juno.com  wrote:

You might want to import everything into SQLite Studio or SQLite Suite 
(I forget the exact name) -- a freebie on the internet. I found it worked 
for me. In order to help others, I would appreciate it if you tell us on 
this usergroup how you made out. May it work for you. Peace! Alex



Thanks for the idea, Alex.

I'm doing all programmatically at the moment (using a Perl 6 module:
DB::SQLite), but using one of those tools you mentioned would help in
design for sure!

-Tom


Re: [sqlite] SQLite slow when lots of tables

2019-01-29 Thread Rob Willett

Millions of rows is not even large, never mind huge or very huge :)

We have tables with hundreds of millions of rows; we got to billions of 
rows in a single table before we changed the logic. From memory we had 
67GB for a single database and I reckon 60GB of that was one table. Not many 
issues at all with inserting or searching. One of our data mining 
queries searched the entire table and it still only took 90 secs, though 
all of the query used indexes.


We only changed what we store as the management of a circa 60GB database 
was too much and we worked out we only needed 1% of it. We were using a 
virtual private server and we had issues with disk IO when we copied the 
database around using Unix cp. This wasn't a SQLite problem at all. 
However I have no doubt that SQLite was more than capable of handling 
even more data.


Rob

On 29 Jan 2019, at 11:00, mzz...@libero.it wrote:


Dear all,

what happens if I put all the data in a single table and this table becomes 
very huge (for example millions of rows)?


Will I have the same performance problems?

Thanks.


Regards.



On 28 January 2019 at 17:28, Simon Slavin wrote:


On 28 Jan 2019, at 4:17pm, mzz...@libero.it wrote:

when the number of tables becomes huge (about 15000/2 tables), the 
first database read query after the database is opened is very slow 
(about 4 sec.), while subsequent read operations are faster.

How can I speed up?

Put all the data in the same table.

At the moment, you pick a new table name each time you write 
another set of data to the database. Instead of that, create just one 
big table, and add an extra column, called "dataset", to the columns 
which already exist. In that column you put the string you previously 
used as the table name.
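
A minimal sketch of that suggestion (the table and column names are illustrative):

```
-- One table holds every dataset; the old per-dataset table name becomes a value.
CREATE TABLE measurements (
    dataset TEXT    NOT NULL,   -- the string previously used as the table name
    ts      INTEGER NOT NULL,
    value   REAL
);
CREATE INDEX measurements_dataset_idx ON measurements (dataset);

-- What used to be "SELECT * FROM some_dataset_table" becomes:
SELECT ts, value FROM measurements WHERE dataset = 'some_dataset_table';
```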


SQL is not designed to have a variable number of tables in a 
database. All the optimization is done assuming that you will have a 
low number of tables, and rarely create or drop tables.


Simon.



Re: [sqlite] Mac: Users receive "database disk image is malformed" errors after restoring database from Time Machine backup

2018-12-12 Thread Rob Willett
Whilst Time Machine does not do snapshots the way enterprise storage does 
snapshots (literally a freeze and recovery point), Time Machine does make 
backups suitable for booting from. Apple considers Time Machine suitable 
for home-use backups.


You can back up with Time Machine and boot from it. My personal experience 
of TM is that it's flaky and unreliable, and we use Carbon Copy Cloner AND 
ChronoSync to get backups that I trust to work. YMMV.


However, we are going off the point from the OP. I personally think TM is 
to blame and would advise the users to check the TM setup and experiment 
with delaying the WAL backup just to see what would happen.


Rob

On 12 Dec 2018, at 16:58, Peter da Silva wrote:

Apple uses Sqlite in a number of applications, including Apple Mail, 
so
they have to have some kind of accommodation for saving sqlite 
databases.


The Time Machine patent does not describe using file system snapshots:


*"An algorithm or other monitoring can be used to detect changes that 
occur

during the backup operation in order to maintain consistency between
related data in the backup. The back up can be performed again for 
related

data that was modified during prior backup operation. *

*"In general, in one aspect, a method is provided. A backup operation 
of
data including a plurality of related items is initiated. 
Modifications to

one or more items of the plurality of related items are monitored for
during the backup operation. The backup operation is completed. If a
modification occurred to one or more items, a second backup operation 
is

performed for the modified items."*

This does not seem to authoritatively state that multiple files will 
be

backed up consistently.

On Wed, Dec 12, 2018 at 9:06 AM Keith Medcalf  
wrote:




I know nothing about "Time Machine", but does it copy the entire
filesystem in (at least) "crash consistent" state?

---
The fact that there's a Highway to Hell but only a Stairway to Heaven 
says

a lot about anticipated traffic volume.



-Original Message-
From: sqlite-users [mailto:sqlite-users-
boun...@mailinglists.sqlite.org] On Behalf Of Daniel Alm
Sent: Tuesday, 11 December, 2018 05:02
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Mac: Users receive "database disk image is
malformed" errors after restoring database from Time Machine backup

Hi,

For the past half year we’ve been receiving reports from users who
had restored their SQLite-based databases from a Time Machine 
backup.

Afterwards, they would receive "database disk image is malformed”
errors. The app also backs up the user’s data “manually” to a 
ZIP

file every week; those backups seem to be working fine. We also
haven’t received reports from other backup tools causing issues. I
have also suspected a bug in Time Machine, but it is striking that
the issues did seem to start occurring after an update to the app
(luckily, in fact, with the same update that also introduced the
“manual” backups).

Changes that we made to our setup in the update that coincided with
the errors occurring:
- Upgraded SQLite from 3.21 to 3.24 (we have since reverted to 
3.23.1

in another update; no improvement).
- Used memory mapping for read accesses via “PRAGMA mmap_size =
1073741824;” (we have since reverted to “PRAGMA mmap_size = 
0;” after

reading http://sqlite.1065341.n5.nabble.com/Re-Database-corruption-
and-PRAGMA-fullfsync-on-macOS-td95366.html
; no improvement).
- Using a secondary database via [ATTACH
DATABASE](https://www.sqlite.org/lang_attach.html
) (although this also seems
to occur for users without such a database).

At this point, I am at a loss, especially given that SQLite should 
be

fairly robust against database corruption. While our app is running
in the background all the time, it is not very write-heavy (~ one
transaction per minute taking just a few milliseconds). Also, the 
app

had been running fine before the update for a long time without any
reports of this issue. I might be doing something wrong or have
changed anything else, but I don’t know what; if you have any 
ideas,

let me know.

Any suggestions on what could be the culprit or what else I could 
try

besides downgrading all the way to SQLite 3.21 would be appreciated.

Thanks,
Daniel Alm

P.S.: Our database currently uses the following PRAGMAs:

PRAGMA mmap_size = 0;
PRAGMA page_size = 4096;
PRAGMA cache_size = -10240;
PRAGMA foreign_keys = ON;
PRAGMA journal_size_limit = 8388608;
PRAGMA checkpoint_fullfsync = 1;
PRAGMA wal_autocheckpoint = 2048;
PRAGMA journal_mode = WAL;

Happy to provide any more details as needed.

Re: [sqlite] Mac: Users receive "database disk image is malformed" errors after restoring database from Time Machine backup

2018-12-12 Thread Rob Willett

Also are they using Time Machine on a networked drive?

Whilst Time Machine was not supposed to work across networks, people 
have made it work using 3rd party software. I know because we tried it 
for a laugh and abandoned it (and Time Machine) as it was wholly 
unreliable.


However I think the WAL commit (or uncommit) is a more likely scenario.

Rob

On 12 Dec 2018, at 15:00, R Smith wrote:


On 2018/12/12 4:48 PM, Richard Hipp wrote:

On 12/11/18, Daniel Alm  wrote:
Any suggestions on what could be the culprit or what else I could 
try

besides downgrading all the way to SQLite 3.21 would be appreciated.


Nothing about SQLite has changed that should make a difference here.

Do you know if the corruption is occurring when TimeMachine makes its
backup, or is occurring when the backed up database is restored?  Can
you capture some unrestored TimeMachine backups to see if they are
corrupt?


My best guess here is that TimeMachine somehow captures the sqlite DB 
files a few milliseconds apart "sometimes" so that a journal file that 
has just been committed is in the TimeMachine backup captured still in 
its uncommitted state while the DB itself is in the committed state 
already (or perhaps vice-versa).


This theory however makes no sense unless either the journal mode in 
SQLite has changed (either via update or pragma change by user), or 
TimeMachine itself has changed its ways since when things used to work 
correctly.






Re: [sqlite] [EXTERNAL] Re: Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett

David,

We've been through that level of detail. That's how we found that, after 
an ANALYZE, a new automatic covering index was being created for a 
query where it was not needed before the ANALYZE. That puzzled us and still 
does.


Rob

On 31 Jul 2018, at 16:58, David Raymond wrote:

Take a look at the queries being run and do an "explain query plan" 
for each and look at the output to see what it decides to use. Look 
for "SCAN TABLE" cases that might benefit, or other oddities where 
it's doing anything you're not expecting.
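
A minimal sketch of that check (the query here is illustrative; Disruptions is the table from this thread):

```
EXPLAIN QUERY PLAN
SELECT * FROM Disruptions WHERE location = 123 ORDER BY version;
-- Output such as "SCAN TABLE Disruptions" suggests a full table scan,
-- whereas "SEARCH TABLE Disruptions USING INDEX ..." shows an index in use.
```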



-Original Message-
From: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Rob 
Willett

Sent: Tuesday, July 31, 2018 11:42 AM
To: SQLite mailing list
Subject: Re: [sqlite] [EXTERNAL] Re: Very, very slow commits - 
Possibly solved


Gunter,

Thanks for this. We have already started on this as we realised that 
the
COLLATE NOCASE was irrelevant and actually slowed down integer 
queries.


What we have not done is reorder the table to match the index queries.
This had not occurred to us.

We think we have already created the indexes so that we 'cover' the
queries. One advantage (!) of a 200M row table is that inefficient
queries are immediately obvious as we have to do a table scan.

We have not checked that the order of columns in the index match the
ORDER BY clauses. We never thought of that either,

Thanks for the help.

Rob

On 31 Jul 2018, at 16:21, Hick Gunter wrote:


Based on the currently available information I would suggest the
following schema:

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER PRIMARY KEY NOT NULL AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "location" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "corridor" integer NOT NULL,
 "severity" integer NOT NULL,
 "startTime" TEXT NOT NULL COLLATE NOCASE,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "endTime" text NOT NULL COLLATE NOCASE,
 "remarkTime" TEXT NOT NULL COLLATE NOCASE,
 "lastModTime" TEXT NOT NULL COLLATE NOCASE,
 "Direction" TEXT COLLATE NOCASE
);

CREATE INDEX "Disruptions_Idx1" ON Disruptions
("location","corridor","status","category","severity","levelOfInterest","version","subCategory","startTime");

CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions
("Disruption_id","version","category","subCategory");

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version","Disruption_id","location");

CREATE INDEX Disruptions_Idx5 ON Disruptions
("status","Disruption_id","Severity","levelOfInterest","category","subCategory","version");

Remarks:

COLLATE is only required for text values. If you always store data of
the declared type, COLLATE has no business with a non-text column.

I have reordered the fields so that fields used in one or more 
indexes

are at the front of the record. This allows SQLite to quit decoding
the record faster when building index strings. I have assumed that
each index is used/updated equally often; if you have an index that 
is

used most, reordering the fields may help processing speed.

Non-index fields should be ordered so that fields that feature
prominently in the retrieved data come first; again, this will allow
SQLite to quit decoding the record earlier when reading data.

It may also improve performance to create a "covering index" by 
adding

the (one or two) fields retrieved to the index used for locating the
record. This allows SQLite to retrieve these fields directly from the
index BTree without referring back to the table BTree.

I assume the order of the fields of each index matches the order of
the fields in the ORDER BY clause(s) of the queries that use the
respective index.

-Original Message-
From: sqlite-users
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of
Rob Willett
Sent: Tuesday, 31 July 2018 16:31
To: SQLite mailing list 
Subject: [EXTERNAL] Re: [sqlite] Very, very slow commits - Possibly
solved

Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our 
database

design. We think our original design has an implicit ordering 

Re: [sqlite] [EXTERNAL] Re: Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett

Gunter,

Thanks for this. We have already started on this as we realised that the 
COLLATE NOCASE was irrelevant and actually slowed down integer queries.


What we have not done is reorder the table to match the index queries. 
This had not occurred to us.


We think we have already created the indexes so that we 'cover' the 
queries. One advantage (!) of a 200M row table is that inefficient 
queries are immediately obvious as we have to do a table scan.


We have not checked that the order of columns in the index matches the 
ORDER BY clauses. We never thought of that either.
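
As an illustration of that column-order point (the query is hypothetical; Disruptions_Idx3 is the existing index on "version", "Disruption_id", "location"):

```
-- With an equality constraint on the leading index column, an ORDER BY that
-- lists the remaining index columns in the same order can be satisfied by
-- walking the index, with no separate sort step:
EXPLAIN QUERY PLAN
SELECT * FROM Disruptions
 WHERE "version" = 1
 ORDER BY "Disruption_id", "location";
-- ORDER BY "location", "Disruption_id" would instead need a temp B-tree sort.
```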


Thanks for the help.

Rob

On 31 Jul 2018, at 16:21, Hick Gunter wrote:

Based on the currently available information I would suggest the 
following schema:


CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER PRIMARY KEY NOT NULL AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "location" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "corridor" integer NOT NULL,
 "severity" integer NOT NULL,
 "startTime" TEXT NOT NULL COLLATE NOCASE,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "endTime" text NOT NULL COLLATE NOCASE,
 "remarkTime" TEXT NOT NULL COLLATE NOCASE,
 "lastModTime" TEXT NOT NULL COLLATE NOCASE,
 "Direction" TEXT COLLATE NOCASE
);

CREATE INDEX "Disruptions_Idx1" ON Disruptions 
("location","corridor","status","category","severity","levelOfInterest","version","subCategory","startTime");


CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions 
("Disruption_id","version","category","subCategory");


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions 
("version","Disruption_id","location");


CREATE INDEX Disruptions_Idx5 ON Disruptions 
("status","Disruption_id","Severity","levelOfInterest","category","subCategory","version");


Remarks:

COLLATE is only required for text values. If you always store data of 
the declared type, COLLATE has no business with a non-text column.


I have reordered the fields so that fields used in one or more indexes 
are at the front of the record. This allows SQLite to quit decoding 
the record faster when building index strings. I have assumed that 
each index is used/updated equally often; if you have an index that is 
used most, reordering the fields may help processing speed.


Non-index fields should be ordered so that fields that feature 
prominently in the retrieved data come first; again, this will allow 
SQLite to quit decoding the record earlier when reading data.


It may also improve performance to create a "covering index" by adding 
the (one or two) fields retrieved to the index used for locating the 
record. This allows SQLite to retrieve these fields directly from the 
index BTree without referring back to the table BTree.
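
A hedged illustration of that point against the schema above (the query and the index name are made up):

```
-- If a query locates rows by (status, Disruption_id) and only returns
-- severity, appending severity to the index lets SQLite answer it from the
-- index B-tree alone, without referring back to the table B-tree:
CREATE INDEX Disruptions_Cover ON Disruptions
    ("status", "Disruption_id", "severity");
-- SELECT severity FROM Disruptions WHERE status = ? AND Disruption_id = ?;
```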


I assume the order of the fields of each index matches the order of 
the fields in the ORDER BY clause(s) of the queries that use the 
respective index.


-Original Message-
From: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of 
Rob Willett
Sent: Tuesday, 31 July 2018 16:31
To: SQLite mailing list 
Subject: [EXTERNAL] Re: [sqlite] Very, very slow commits - Possibly 
solved


Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our database 
design. We think our original design has an implicit ordering of rows 
in a table, when the table is only increasing this flaw in the design 
isn't apparent.


However when we started deduping the table AND we copied rows from one 
table to another to move things around, we changed the underlying 
order of rows. Sqlite handles the design change BUT the flaw in our 
design becomes apparent as we keep moving the data around and data 
gets mixed up. The database slows down when we create a second table 
with an identical structure to the first table, copy the data into the 
new table, drop the old and then when we rename the old table to the 
new table, things appear to slow down. Logically speaking SQLite 
shouldn't notice the difference in row order, but things do slow down, 
even with analyse.


We think that a better index definition could solve the

Re: [sqlite] Very, very slow commits

2018-07-31 Thread Rob Willett

Simon,

Absolutely no need to apologise. We should apologise for all the time we 
have taken from other people :(


We recognise that the COLLATE NOCASE is inappropriate for our database. 
We suspect this was added by an SQLite tool we used some time ago. We 
are going to use this opportunity to remove this sort of nonsense.


We still think we have an inherent design fault in our database that we 
are trying to understand. One of the problems we had was that checking 
the performance of a 200M row table for bad indexes is time consuming. 
We now have a workflow to get from a 60GB database to a 600MB database 
in a few hours. We cannot do all the work in SQL as it involves an 
external program to analyse the data but a few hours to run isn't bad. 
As we now have the database held locally, we can thrash the local server 
silly to get the performance we need.


Rob

On 31 Jul 2018, at 16:18, Simon Slavin wrote:

On 31 Jul 2018, at 2:59pm, Rob Willett  
wrote:


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE 
to the columns which are defined as integers. Would that make a 
difference?


What you did is correct.  I gave wrong advice for which I apologise.  
But I am now confused since your original code is a little strange.  
Your original has a table definition including


"version" integer NOT NULL,

but then

	CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions 
("Disruption_id" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"category" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC);


If "version" really is an INTEGER, then it is incorrect to use version 
COLLATE NOCASE in the index.  NOCASE is purely for text values.  This 
may be slowing things down.


To solve it, in the table definition, use COLLATE NOCASE for TEXT 
columns and not for INTEGER columns.  Also, remove all mentions of 
COLLATE NOCASE in your index definitions.  Collation methods should be 
set in the table definition, not in indexes, except for some unusual 
situations.


This should increase your speed relative to your original timings.  If 
it slows things down, something else I haven't spotted is wrong.


We've found it now takes around 10% longer to do the queries than 
before.


That is understandable given the incorrect advice I gave you before.

In another post you report some strange timing problems with no simple 
explanation.  When I get such things I suspect database corruption or 
hardware problems and run an integrity_check.  But with a 60Gig 
database I might think twice.


Simon.


Re: [sqlite] Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett
 file misses from the OS 
cache. I've yet to validate that theory. It could also be that something 
we do in messing with the sqlite_sequence table and that data is being 
inserted into holes somewhere.


15. We have also looked at an older piece of code and we *think* it 
makes an assumption that data is held in contiguous rows (or it could be 
that the query is poorly written and that we need to look at the 
indexes). The code isn't so obvious as to look at row_ids and work with 
them, but it's a hint that we're still chasing down.


16. We did notice that running analyse actually made the database 
perform worse than before. This was due to it using a specific index 
before the analyse and then afterwards it used an automatic covering 
index. We then created a real index to get it working correctly again.


17. We're now going back to our 60GB database to try and work through 
the whole process again to see if we can confirm our hypothesis.


Rob

On 31 Jul 2018, at 15:40, Chris Locke wrote:

I've been following this thread with interest, but this just doesn't 
make

sense...

 Logically speaking SQLite shouldn't notice the difference in row 
order,

but things do slow down,

even with analyse.


Are you accessing each row via its ID?  Even so, that should still be
indexed.
I thought you were simply adding records into the database - I'm 
failing to

grasp how this is slowing down in the new database.


Thanks,
Chris



On Tue, Jul 31, 2018 at 3:30 PM Rob Willett 


wrote:


Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our 
database
design. We think our original design has an implicit ordering of rows 
in
a table, when the table is only increasing this flaw in the design 
isn't

apparent.

However when we started deduping the table AND we copied rows from 
one
table to another to move things around, we changed the underlying 
order

of rows. Sqlite handles the design change BUT the flaw in our design
becomes apparent as we keep moving the data around and data gets 
mixed

up. The database slows down when we create a second table with an
identical structure to the first table, copy the data into the new
table, drop the old and then when we rename the old table to the new
table, things appear to slow down. Logically speaking SQLite 
shouldn't
notice the difference in row order, but things do slow down, even 
with

analyse.

We think that a better index definition could solve the problem for 
us,

a better database design would, but thats a tricky problem.

We're now going back to our 60GB database and start from scratch to 
see

if we can create the issue (now we think we know what it is).

Thanks to everybody who contributed ideas, we appreciate the help.

Rob

On 31 Jul 2018, at 15:19, Rob Willett wrote:


Simon,

As an exercise we have just added in COLLATE NOCASE to our integer
columns.

Whoops! We thought this would make no difference but its added extra
70% to our processing speeds.

We've now got to the stage where we can make changes quickly, so 
we'll

back that change out and go back to the integer defn without COLLATE
NOCASE.

Rob

On 31 Jul 2018, at 14:59, Rob Willett wrote:


Simon,

Apologies for taking so long to get back, we've been building a 
test

system and its taken a long time.

We're just getting round to trying your ideas out to see what
difference they make,

We've created a new table based on your ideas, moved the collate 
into

the table, analysed the database. We did **not** add COLLATE NOCASE
to the columns which are defined as integers. Would that make a
difference?

We've found it now takes around 10% longer to do the queries than
before.

Rob



Please try moving your COLLATE clauses into the table definition.
e.g. instead of

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions 
("version"

COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location"
COLLATE NOCASE ASC);

Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long
time.

Simon.

Re: [sqlite] Very, very slow commits - Possibly solved

2018-07-31 Thread Rob Willett

Dear all,

We think we have now found the issue with the slow commits.

We believe this is due to an inherent (and old) defect in our database 
design. We think our original design has an implicit ordering of rows in 
a table; while the table is only ever growing, this flaw in the design 
isn't apparent.

However, when we started deduping the table AND copying rows from one 
table to another to move things around, we changed the underlying order 
of rows. SQLite handles the design change, BUT the flaw in our design 
becomes apparent as we keep moving the data around and the data gets mixed 
up. The slowdown shows when we create a second table with an 
identical structure to the first table, copy the data into the new 
table, drop the old table and then rename the new table to the old 
name; at that point things appear to slow down. Logically speaking SQLite 
shouldn't notice the difference in row order, but things do slow down, 
even with ANALYZE.
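
A hedged sketch of that rebuild sequence (note that CREATE TABLE ... AS SELECT does not copy constraints or the PRIMARY KEY, so in practice the new table would be created with the full original column definitions instead):

```
BEGIN;
CREATE TABLE Disruptions_new AS SELECT * FROM Disruptions WHERE 0;
INSERT INTO Disruptions_new SELECT * FROM Disruptions;
DROP TABLE Disruptions;
ALTER TABLE Disruptions_new RENAME TO Disruptions;
COMMIT;
-- recreate the indexes on the renamed table, then refresh the statistics
ANALYZE;
```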


We think that a better index definition could solve the problem for us; 
a better database design certainly would, but that's a tricky problem.


We're now going back to our 60GB database and starting from scratch to see 
if we can recreate the issue (now that we think we know what it is).


Thanks to everybody who contributed ideas, we appreciate the help.

Rob

On 31 Jul 2018, at 15:19, Rob Willett wrote:


Simon,

As an exercise we have just added in COLLATE NOCASE to our integer 
columns.


Whoops! We thought this would make no difference but its added extra 
70% to our processing speeds.


We've now got to the stage where we can make changes quickly, so we'll 
back that change out and go back to the integer defn without COLLATE 
NOCASE.


Rob

On 31 Jul 2018, at 14:59, Rob Willett wrote:


Simon,

Apologies for taking so long to get back, we've been building a test 
system and its taken a long time.


We're just getting round to trying your ideas out to see what 
difference they make,


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE 
to the columns which are defined as integers. Would that make a 
difference?


We've found it now takes around 10% longer to do the queries than 
before.


Rob


Please try moving your COLLATE clauses into the table definition.  
e.g. instead of


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);


Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long 
time.


Simon.


Re: [sqlite] Very, very slow commits

2018-07-31 Thread Rob Willett

Simon,

As an exercise we have just added in COLLATE NOCASE to our integer 
columns.


Whoops! We thought this would make no difference but it's added an extra 70% 
to our processing times.


We've now got to the stage where we can make changes quickly, so we'll 
back that change out and go back to the integer defn without COLLATE 
NOCASE.


Rob

On 31 Jul 2018, at 14:59, Rob Willett wrote:


Simon,

Apologies for taking so long to get back, we've been building a test 
system and its taken a long time.


We're just getting round to trying your ideas out to see what 
difference they make,


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE to 
the columns which are defined as integers. Would that make a 
difference?


We've found it now takes around 10% longer to do the queries than 
before.


Rob


Please try moving your COLLATE clauses into the table definition.  
e.g. instead of


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);


Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long 
time.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-31 Thread Rob Willett

Simon,

Apologies for taking so long to get back, we've been building a test 
system and it's taken a long time.


We're just getting round to trying your ideas out to see what difference 
they make.


We've created a new table based on your ideas, moved the collate into 
the table, analysed the database. We did **not** add COLLATE NOCASE to 
the columns which are defined as integers. Would that make a difference?


We've found it now takes around 10% longer to do the queries than 
before.


Rob


Please try moving your COLLATE clauses into the table definition.  
e.g. instead of


CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);


Your table definition should have

 "version" integer NOT NULL COLLATE NOCASE,
 "Disruption_id" INTEGER NOT NULL COLLATE NOCASE,
...
 "location" integer NOT NULL COLLATE NOCASE,

and the index should be

CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" ASC, "Disruption_id" ASC, "location" ASC);

Once data has been entered, do ANALYZE.  This step may take a long 
time.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Droedel,

Fortunately we have no performance issues in production with the 60GB 
database. These issues came out in testing (which is what testing is 
for).


We're investigating the newly generated IDs as we speak, or we will be 
once we get our replica production system set up.


sqlite3_analyzer has been a problem for us. We've struggled to get a 
build for it.


Rob

On 30 Jul 2018, at 13:49, Droedel wrote:


Hi Rob,
Answers are in the text below

On Mon, Jul 30, 2018, at 13:45, Rob Willett wrote:

Droedel,

We don't think there are significant read access. The database is a
single database on a single thread on a single process. The only 
access
to it is a Perl script that logs the incoming information. We never 
have

two accesses at the same time.


Can you also _measure_ read access, preferably on system level on your 
production database, e.g. by using iostat ? I've seen cases where 
(other, non-SQLite) databases had unexpected disk access patterns due 
to an application error.


We have a nagging feeling (and thats all it is) about the 
autoincrement
value. We do use that feature in the table, but we have deleted so 
many


[snip]
If both databases use autoincrement, then performance should be 
similar (or at least that's what I expect). Can you easily check if 
the newly generated IDs are as expected and larger than any existing 
ID in your table ?



We did wonder if we are filling up pages in the middle or something.
However we expected the vacuum and analyse to sort this out. Now its


[snip]
sqlite3_analyzer can give some measurements, e.g. unused bytes on 
index pages.



We've built the replica test system now and we're going to have some
initial checks and get some benchmarks in place.

It could be an interesting and exciting ride :)


Sure. It's always fun learning something new. But it's less fun in 
full production when customers are yelling :-(


Regards,

Droedel




Rob


On 30 Jul 2018, at 12:32, Droedel wrote:


Hi Rob,

Is there significant read access (iostat: r/s) during these slow
writes ? If yes, it might be due to a small cache, requiring the
database to read (index) pages before updating them.

And is the data you're adding in both databases (large/small) added 
at

the end of the table using the autoincrement, or do you insert some
items in the middle ? I'm not a SQLite performance expert, but in
other databases performance can be very different because in the
former case fewer pages must be updated.

Microsoft SQL Server has something called "fill factor", basically 
the

max percentage of an index page that is used during initial fill,
which helps avoid too much page shuffling in the index when extra
items are added. Disadvantage: it makes DBAs argue endlessly about 
the

best fill factor ;-) Maybe there's something similar possible in
SQLite but I couldn't find a pragma for this.

Oh, and do both databases have the same page size, preferably 4K ?

Regards,

Droedel



On Mon, Jul 30, 2018, at 12:51, Rob Willett wrote:

Droedel,

Thanks for the comprehensive reply. We have actually done all of
this.

The system has been running for 2-3 years and we have taken the
opportunity to try and prune the database from 60GB down to 600MB.
Currently the live system is working OK with a 60GB database, but 
our

test system (which is smaller) is struggling with 600MB.

The system has a restriction of IOPS as it's a Virtual Private
Server.
Technically it's running Ubuntu 16.04 under OpenVZ. We can get
69MB/sec
with a disk to disk copy, which isn't brilliant if we had sustained
disk
traffic which we don't.

We log approx 600 - 800 items of around 3-5K every five minutes.
These
600-800 items are mainly an insert into a single table, there are
other
things happening as well, but this is the bulk of the work. We can
see
that the -wal files grow a small amount (4-5MB) just before the
commit.
It then takes 7 seconds to execute the commit. This is the bit that
we're struggling with. We know we can get circa 70MB/sec data
throughput, so this should take a fraction of a second. Now SQLite
needs
to work out which pages to commit so thats a little longer, but we
know
SQLite is fast, so that shouldn't take 7 seconds on the small
database
as it doesn't take that long on the large 60GB database. Thats the
puzzling bit, the large database is quick, the small one slow.

We have no logging turned on, we can turn SQL logging on at the DBI
level but that turns a 20 sec run into a 2-3 minute run as it
captures
everything :) Nothing in the log files gives us any concern (apart
from
the really long commit time). Simon Slavin suggested dropping the
indexes which we did, that turned the commit into a fast commit, so
its
something to do with the indexes but we can't see what.

What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB memory and 8
cores
and 150GB disk space. Its virtualised (ESXI) but under our control.
2. We've instal

Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Warren,

The hardware is different, at the time we didn't want to spin up a 
complete production replica as thats quite expensive. We used a smaller 
machine, both have the same type of back end spining raid array, but we 
would think that writing 4-5MB of changed data back shouldn't take 7 
seconds. We had seen far better performance on the slower machine 
earlier in testing.


We will go back to step one and work our way through step by step from 
60GB to 600Mb as our thinking is that we have somehow screwed our 
database up.


Rob


On 30 Jul 2018, at 13:29, Warren Young wrote:

On Jul 30, 2018, at 5:53 AM, Rob Willett 
 wrote:


I would wonder why writing the data to a 60GB database and doing a 
commit is fast and writing exactly the same data to the 600MB 
database is different. The programs for doing it are the same, the 
database schema is identical.


I assume the hardware is different.  Is that not the case?

If the small DB is on a machine with a spinning disk but the large DB 
is on a machine with either an SSD or a many-spindled RAID, there’s 
your key difference.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Warren,

On 30 Jul 2018, at 12:28, Warren Young wrote:

On Jul 30, 2018, at 4:51 AM, Rob Willett 
 wrote:


The system has been running for 2-3 years


Has performance changed over that span?  Which direction?


Performance hasn't changed on the large 60GB database. It's pretty 
consistent.




we have taken the opportunity to try and prune the database from 60GB 
down to 600MB.


SQLite’s speed is only weakly affected by database size.  A starting 
guess is log2(N) where N is file size.


Since your data access didn’t get 10x faster from your 100x size 
drop, you can fairly guess that the speed problem isn’t due to the 
inherent time required to traverse tree-based data structures in 
SQLite.




The reason for the pruning is not for the benefit of SQLite, it's for 
our own administration. Backing up 60GB files is hard work, we're also 
struggling to fire up replicas, so we wanted to look at the database as 
a whole and get the size down. The 99% reduction is great, we were 
hoping for a 50% to 80% reduction.


The system has a restriction of IOPS as it's a Virtual Private 
Server. Technically it's running Ubuntu 16.04 under OpenVZ. We can 
get 69MB/sec with a disk to disk copy, which isn't brilliant if we 
had sustained disk traffic which we don’t.


I am sensing a spinning disk.  (Else, throughput should be a lot 
higher.)


I suspect that is the case. However 69MB/sec is adequate for the 60GB 
database and for normal usage. Our commits are fine on the 60GB 
database, just not the 600MB one.


SQLite takes data durability (the D in ACID) seriously, unlike most 
other software applications, so it is uncommon in that it flushes each 
transaction to disk before proceeding with further writes to that 
table.


A flush to disk takes a full disk rotation, and due to the way 
SQLite’s journal operates, each transaction requires two flushes.  
That means that with a 7200 RPM disk, you can get a maximum of 60 
transactions per second per table with SQLite.


I agree with your logic, but I would wonder why writing the data to a 
60GB database and doing a commit is fast and writing exactly the same 
data to the 600MB database is different. The programs for doing it are 
the same, the database schema is identical.




Sound familiar?

If I’ve guessed the problem correctly, the solutions are:

1. Batch multiple writes in a transaction.


Already done.
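
At the SQL level that batching is just a sketch like this (table and 
values are hypothetical):

    BEGIN;
    INSERT INTO incoming(item, payload) VALUES (1, 'first item');
    INSERT INTO incoming(item, payload) VALUES (2, 'second item');
    -- ...the remaining few hundred inserts for this five-minute batch...
    COMMIT;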



2. Switch to an SSD.


Not an option in the short term for production BUT the test system we 
have setup has an SSD. This may skew the results though. Its difficult 
to get an identical system setup, but the other option is a VMWare 
Fusion system on a Mac with a spinning disk. Most of our local systems 
are SSD, we have a Mac with a spinning disk for backup.




3. Use multiple tables and/or multiple DB files.  In your case, I’d 
suggest one SQLite DB per sensor, with one thread per sensor, each of 
which keeps one of the SQLite DBs open continuously.  That way, a 
blocked DB conn won’t block any other writers.


We have one process that reads a single file in every 5 mins. No need 
for multiple databases or multiple threads.




Those solutions are given in order of ease of application and cost of 
implementation.



Nothing in the log files gives us any concern


Have you tried SQLite’s new .expert feature?


No, but we will now :)


   https://sqlite.org/cli.html#index_recommendations_sqlite_expert_
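
For reference, a rough sketch of how .expert is used from the sqlite3 
shell (the query here is made up):

    sqlite> .expert
    sqlite> SELECT * FROM Disruptions WHERE status = 1 AND category = 2;
    (the shell then prints any CREATE INDEX statements it thinks would
    help that query, followed by the query plan it would use)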


dropping a very large table is really, really, really slow.


If you put your main data table in a file of its own, you can quickly 
“drop” the table by just closing the DB and removing the DB file 
from disk.


Thats what we will do on our test system.



When you then recreate the DB file with a fresh schema, it’s 
effectively defragged/vacuumed as well.


We have a copy of the 60GB data file (took a long time to download) 
locally now. We will use this (or rather a copy) to start the testing.


Thanks very much for the thoughtful and useful comments.

Rob


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Droedel,

We don't think there are significant read access. The database is a 
single database on a single thread on a single process. The only access 
to it is a Perl script that logs the incoming information. We never have 
two accesses at the same time.


We have a nagging feeling (and that's all it is) about the autoincrement 
value. We do use that feature in the table, but we have deleted so many 
rows and we may have messed something up as we move data from table to 
table. We did read https://www.sqlite.org/autoinc.html a week or so ago 
and, given that we have deleted circa 199,500,000 of the table's 
200,000,000 rows, we wonder if we have cocked it up somehow. We have 
never directly accessed the sqlite_sequence table (except to read) but 
that page makes us wonder if we have missed an error somewhere.
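
A quick sanity check for that, as a sketch, is to compare the stored 
sequence value against the largest id actually in the table:

    SELECT seq FROM sqlite_sequence WHERE name = 'Disruptions';
    SELECT max(id) FROM Disruptions;
    -- with AUTOINCREMENT, seq should be >= max(id); new ids are handed out above seq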


We did wonder if we are filling up pages in the middle or something. 
However we expected the vacuum and analyse to sort this out. Now it's 
entirely possible we have had this problem before now, hence we are going 
to go back to the very beginning with our 61GB database and take it step 
by step so we understand what is happening. What's frustrating is that 
the 61GB database seems fast and the 600MB one seems slow, yet they both 
have the same database schema. We feel that we have made an error 
somewhere but have only now discovered it.


We've built the replica test system now and we're going to have some 
initial checks and get some benchmarks in place.


It could be an interesting and exciting ride :)

Rob


On 30 Jul 2018, at 12:32, Droedel wrote:


Hi Rob,

Is there significant read access (iostat: r/s) during these slow 
writes ? If yes, it might be due to a small cache, requiring the 
database to read (index) pages before updating them.


And is the data you're adding in both databases (large/small) added at 
the end of the table using the autoincrement, or do you insert some 
items in the middle ? I'm not a SQLite performance expert, but in 
other databases performance can be very different because in the 
former case fewer pages must be updated.


Microsoft SQL Server has something called "fill factor", basically the 
max percentage of an index page that is used during initial fill, 
which helps avoid too much page shuffling in the index when extra 
items are added. Disadvantage: it makes DBAs argue endlessly about the 
best fill factor ;-) Maybe there's something similar possible in 
SQLite but I couldn't find a pragma for this.


Oh, and do both databases have the same page size, preferably 4K ?
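
For what it's worth, the page size (and how much free space is sitting 
inside the file) is easy to check from the shell:

    PRAGMA page_size;       -- bytes per page, typically 4096 on recent builds
    PRAGMA page_count;      -- total pages in the database file
    PRAGMA freelist_count;  -- pages currently sitting unused on the free list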

Regards,

Droedel



On Mon, Jul 30, 2018, at 12:51, Rob Willett wrote:

Droedel,

Thanks for the comprehensive reply. We have actually done all of 
this.


The system has been running for 2-3 years and we have taken the
opportunity to try and prune the database from 60GB down to 600MB.
Currently the live system is working OK with a 60GB database, but our
test system (which is smaller) is struggling with 600MB.

The system has a restriction of IOPS as it's a Virtual Private 
Server.
Technically it's running Ubuntu 16.04 under OpenVZ. We can get 
69MB/sec
with a disk to disk copy, which isn't brilliant if we had sustained 
disk

traffic which we don't.

We log approx 600 - 800 items of around 3-5K every five minutes. 
These
600-800 items are mainly an insert into a single table, there are 
other
things happening as well, but this is the bulk of the work. We can 
see
that the -wal files grow a small amount (4-5MB) just before the 
commit.

It then takes 7 seconds to execute the commit. This is the bit that
we're struggling with. We know we can get circa 70MB/sec data
throughput, so this should take a fraction of a second. Now SQLite 
needs
to work out which pages to commit so thats a little longer, but we 
know
SQLite is fast, so that shouldn't take 7 seconds on the small 
database

as it doesn't take that long on the large 60GB database. Thats the
puzzling bit, the large database is quick, the small one slow.

We have no logging turned on, we can turn SQL logging on at the DBI
level but that turns a 20 sec run into a 2-3 minute run as it 
captures
everything :) Nothing in the log files gives us any concern (apart 
from

the really long commit time). Simon Slavin suggested dropping the
indexes which we did, that turned the commit into a fast commit, so 
its

something to do with the indexes but we can't see what.

What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB memory and 8 
cores

and 150GB disk space. Its virtualised (ESXI) but under our control.
2. We've installed a copy of the old 60GB database on the new system.
3. We're going to benchmark the new system over a couple of thousand
runs to see what the average time is.
4. We'll then work our way through the deduping of the database step 
by
step to see when the commit time blow up. This will take a few days 
as
working out the duplications of 200,000,000 rows isn't 

Re: [sqlite] Very, very slow commits

2018-07-30 Thread Rob Willett

Droedel,

Thanks for the comprehensive reply. We have actually done all of this.

The system has been running for 2-3 years and we have taken the 
opportunity to try and prune the database from 60GB down to 600MB. 
Currently the live system is working OK with a 60GB database, but our 
test system (which is smaller) is struggling with 600MB.


The system has a restriction of IOPS as it's a Virtual Private Server. 
Technically it's running Ubuntu 16.04 under OpenVZ. We can get 69MB/sec 
with a disk to disk copy, which isn't brilliant if we had sustained disk 
traffic which we don't.


We log approx 600 - 800 items of around 3-5K every five minutes. These 
600-800 items are mainly an insert into a single table, there are other 
things happening as well, but this is the bulk of the work. We can see 
that the -wal files grow a small amount (4-5MB) just before the commit. 
It then takes 7 seconds to execute the commit. This is the bit that 
we're struggling with. We know we can get circa 70MB/sec data 
throughput, so this should take a fraction of a second. Now SQLite needs 
to work out which pages to commit so thats a little longer, but we know 
SQLite is fast, so that shouldn't take 7 seconds on the small database 
as it doesn't take that long on the large 60GB database. Thats the 
puzzling bit, the large database is quick, the small one slow.


We have no logging turned on, we can turn SQL logging on at the DBI 
level but that turns a 20 sec run into a 2-3 minute run as it captures 
everything :) Nothing in the log files gives us any concern (apart from 
the really long commit time). Simon Slavin suggested dropping the 
indexes which we did, that turned the commit into a fast commit, so its 
something to do with the indexes but we can't see what.


What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB memory and 8 cores 
and 150GB disk space. Its virtualised (ESXI) but under our control.

2. We've installed a copy of the old 60GB database on the new system.
3. We're going to benchmark the new system over a couple of thousand 
runs to see what the average time is.
4. We'll then work our way through the deduping of the database step by 
step to see when the commit time blow up. This will take a few days as 
working out the duplications of 200,000,000 rows isn't that quick :) As 
we found out, dropping a very large table is really, really, really 
slow.
5. We'll apply some of the ideas that people have suggested since 
yesterday to see if they work, but I'm keen that we have a repeatable 
problem that we solve rather than we use a scatter gun approach to 
fixing it. We think SQLite is well written so we figure the problem is 
ours to solve rather than simply blaming the software.



Thanks

Rob

On 30 Jul 2018, at 11:11, Droedel wrote:


Hi,

When having bad performance, I usually first try to find out if the 
slowness is due to disk througput (sequential), slow random access or 
something else. In Linux, try "iostat -xtc 5". Do this with and 
without your application writing to disk.


If you see high CPU %iowait and high %util on your disk, then disk is 
the bottleneck. If not: start profiling / analyzing other bottlenecks 
(CPU / network / ...)


If the disk throughput (wMB/s) is close to your normal sequential 
throughput (69 MB/s): try to write less data or get a faster disk.
If the disk throughput is low, but there are high numbers of writes (w/s): 
there's too much seeking / too many small writes to your disk. Page 
cache too small ? Checkpointing too often ?
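
Both of those knobs can be inspected and adjusted from SQL if anyone 
wants to experiment (the cache value below is only illustrative):

    PRAGMA cache_size;              -- negative values are KiB
    PRAGMA cache_size = -200000;    -- roughly 200 MB of page cache
    PRAGMA wal_autocheckpoint;      -- defaults to 1000 pages
    PRAGMA wal_checkpoint(PASSIVE); -- checkpoint the -wal file by hand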


Sometimes this kind of problem is caused by other applications 
(logging / ...) causing too much baseload. %util should be low when 
your application isn't running.


Just my 2 cents.

Kind regards,

Droedel


On Sun, Jul 29, 2018, at 10:14, Rob Willett wrote:

Hi,

Background

We've been refactoring our database to reduce the size of it. Through
some simple logic we've managed to pull out 99% of the data to reduce
the size from 51GB down to approx 600MB. This logic has been to 
remove

rows that are almost the same but not quite identical. As with all
things, the thinking was the difficult bit, the execution somewhat
easier.

As part of the testing we've been doing, we've now hit on an odd and
weird problem to do with the COMMIT statement. A commit of a few 
hundred
(circa 600-800) rows takes approx 7 seconds whereas before we never 
even

noticed it, though we now know it was two seconds before. Each row is
probably 1-2K of data, so its not very much at all.

Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent 
69MB/sec.

This is not as fast as we would like, but it's the same across all our
virtual servers.

4. We've tested the commit on our existing 60G

Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Thanks for the mail.

We ran analyze with no indexes, made no difference.

We recreated the indexes and ran analyze again. The very long commit is 
back, this time it took 14 secs :)


It does appear that the indexes have something to do with this whereby 
they have not been an issue to now.


Rob

On 29 Jul 2018, at 11:45, J. King wrote:

On July 29, 2018 5:47:29 AM EDT, Rob Willett 
 wrote:

John,

Thanks for the prompt reply and a very good question..

We've dropped the indexes and the commit is now very quick, approx 
two

seconds

However the overall performance of the run is much the same as other
areas of the code are now significantly slower, whereas before they
were
quick.

Where were you going with that question?


Might ANALYZE help?
--
J. King
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Anton,

Dropped the indexes and created them without order.

We'll need to look at what your second para means. It could be a major 
and massive change.
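
For the record, 'without order' presumably just means dropping the 
ASC / COLLATE qualifiers from the definitions, e.g. a sketch against the 
Idx3 index quoted elsewhere in the thread:

    DROP INDEX IF EXISTS "Disruptions_Idx3";
    CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
        ("version", "Disruption_id", "location");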


Rob



On 29 Jul 2018, at 11:52, Djelf wrote:


Rob,

Try creating indexes without order.

Or, try to make a column with a hash of the values entering the index 
and
search for the value by the index of. This will complicate the logic 
of your

program, but it will decrease the volume of the database, and possibly
significantly speed up both reading and writing.

---
Anton Azanov



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

John,

Thanks for the prompt reply and a very good question..

We've dropped the indexes and the commit is now very quick, approx two 
seconds


However the overall performance of the run is much the same as other 
areas of the code are now significantly slower, whereas before they were 
quick.


Where were you going with that question?

Thanks

Rob

On 29 Jul 2018, at 10:33, John Found wrote:


What is the performance without the indexes?

On Sun, 29 Jul 2018 10:20:11 +0100
"Rob Willett"  wrote:


Update 1

We've copied the disruptions table to a new table, dropped the old
table, copied the new table back in and recreated all the indexes.

Exactly the same commit performance.

We've also tracked the -shm and -wal files and they are around 5MB in
size.

Mmmm tricky (as Deepthought said).

Rob

On 29 Jul 2018, at 9:14, Rob Willett wrote:


Hi,

Background

We've been refactoring our database to reduce the size of it. 
Through
some simple logic we've managed to pull out 99% of the data to 
reduce
the size from 51GB down to approx 600MB. This logic has been to 
remove

rows that are almost the same but not quite identical. As with all
things, the thinking was the difficult bit, the execution somewhat
easier.

As part of the testing we've been doing, we've now hit on an odd and
weird problem to do with the COMMIT statement. A commit of a few
hundred (circa 600-800) rows takes approx 7 seconds whereas before 
we

never even noticed it, though we now know it was two seconds before.
Each row is probably 1-2K of data, so its not very much at all.

Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent
69MB/sec. This is not as fast we would like, but it's the same 
across

all our virtual servers.

4. We've tested the commit on our existing 60GB database and it 
takes
2 seconds, which is longer than we thought it would be. The server 
for

the 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu
14.04. The server we are testing on is a 2GB/2 core test server
running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't
expect it to take 3 times longer to do a commit.

5. The code is identical across the servers. We are running Perl and
the DBI module. The code for doing a commit in Perl::DBI is
 $dbh->do("COMMIT");

  We are getting the expected performance elsewhere on the system 
and

in the code. It's just the commit that is taking a long time.

6. The code we are committing is adding 600-800 lines to a table 
that

used to be 200,000,000 rows in size. It's now 400,000 lines in size.
We are wondering if the deletion of the lines has had an impact we
didn't expect. We have vacuumed and analysed the database.

The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE
ASC, "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC,
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC,
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions 
("Disruption_id"

COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version"
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location"
COLLATE NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status",
"Disruption_id", "Severity", "levelOfInterest", "category", "subCategory", "version");

Re: [sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Update 1

We've copied the disruptions table to a new table, dropped the old 
table, copied the new table back in and recreated all the indexes.
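
For completeness, the usual shape of that rebuild, sketched with a made-up 
temporary name and a trimmed column list (in practice the full CREATE TABLE 
quoted below is used so the constraints survive):

    CREATE TABLE "Disruptions_new" (
      "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
      "version" integer NOT NULL,
      "Disruption_id" INTEGER NOT NULL
    );
    INSERT INTO "Disruptions_new" ("id", "version", "Disruption_id")
      SELECT "id", "version", "Disruption_id" FROM "Disruptions";
    DROP TABLE "Disruptions";
    ALTER TABLE "Disruptions_new" RENAME TO "Disruptions";
    -- then re-run the CREATE INDEX statements and ANALYZE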


Exactly the same commit performance.

We've also tracked the -shm and -wal files and they are around 5MB in 
size.


Mmmm tricky (as Deepthought said).

Rob

On 29 Jul 2018, at 9:14, Rob Willett wrote:


Hi,

Background

We've been refactoring our database to reduce the size of it. Through 
some simple logic we've managed to pull out 99% of the data to reduce 
the size from 51GB down to approx 600MB. This logic has been to remove 
rows that are almost the same but not quite identical. As with all 
things, the thinking was the difficult bit, the execution somewhat 
easier.


As part of the testing we've been doing, we've now hit on an odd and 
weird problem to do with the COMMIT statement. A commit of a few 
hundred (circa 600-800) rows takes approx 7 seconds whereas before we 
never even noticed it, though we now know it was two seconds before. 
Each row is probably 1-2K of data, so its not very much at all.


Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent 
69MB/sec. This is not as fast as we would like, but it's the same across 
all our virtual servers.


4. We've tested the commit on our existing 60GB database and it takes 
2 seconds, which is longer than we thought it would be. The server for 
the 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu 
14.04. The server we are testing on is a 2GB/2 core test server 
running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't 
expect it to take 3 times longer to do a commit.


5. The code is identical across the servers. We are running Perl and 
the DBI module. The code for doing a commit in Perl::DBI is

 $dbh->do("COMMIT");

  We are getting the expected performance elsewhere on the system and 
in the code. It's just the commit that is taking a long time.


6. The code we are committing is adding 600-800 lines to a table that 
used to be 200,000,000 rows in size. It's now 400,000 lines in size. 
We are wondering if the deletion of the lines has had an impact we 
didn't expect. We have vacuumed and analysed the database.


The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE 
ASC, "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" 
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" 
COLLATE NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status", 
"Disruption_id", "Severity", "levelOfInterest", "category", 
"subCategory", "version");


We have checked that this schema is consistent across the databases.

We're about to recreate the table to see if that makes a difference.

Any help or advice welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Very, very slow commits

2018-07-29 Thread Rob Willett

Hi,

Background

We've been refactoring our database to reduce the size of it. Through 
some simple logic we've managed to pull out 99% of the data to reduce 
the size from 51GB down to approx 600MB. This logic has been to remove 
rows that are almost the same but not quite identical. As with all 
things, the thinking was the difficult bit, the execution somewhat 
easier.


As part of the testing we've been doing, we've now hit on an odd and 
weird problem to do with the COMMIT statement. A commit of a few hundred 
(circa 600-800) rows takes approx 7 seconds whereas before we never even 
noticed it, though we now know it was two seconds before. Each row is 
probably 1-2K of data, so its not very much at all.


Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent 69MB/sec. 
This is not as fast as we would like, but it's the same across all our 
virtual servers.


4. We've tested the commit on our existing 60GB database and it takes 2 
seconds, which is longer than we thought it would be. The server for the 
60GB database is a large VPS with 8GB/8 cores and runs Ubuntu 14.04. The 
server we are testing on is a 2GB/2 core test server running Ubuntu 
16.04. Whilst the test server is smaller, we wouldn't expect it to take 
3 times longer to do a commit.


5. The code is identical across the servers. We are running Perl and the 
DBI module. The code for doing a commit in Perl::DBI is

 $dbh->do("COMMIT");

  We are getting the expected performance elsewhere on the system and 
in the code. It's just the commit that is taking a long time.


6. The code we are committing is adding 600-800 lines to a table that 
used to be 200,000,000 rows in size. It's now 400,000 lines in size. We 
are wondering if the deletion of the lines has had an impact we didn't 
expect. We have vacuumed and analysed the database.


The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC, 
"category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" COLLATE 
NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE 
NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status", "Disruption_id", 
"Severity", "levelOfInterest", "category", "subCategory", "version");


We have checked that this schema is consistent across the databases.

We're about to recreate the table to see if that makes a difference.

Any help or advice welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Simon,

No, we knew about normalisation, the database is normalised, that part 
of the design hasn't changed in years.


The reasons for the massive reduction in database size is that we worked 
out how to handle repeating lines of data that change just enough that 
we thought they needed to be kept. With some small tweaks we could 
eliminate data that we thought we needed to preserve. We had assumed 
that we needed to keep everything, but by going back in, looking at what 
actual data we now had, we worked out we could store a small amount of 
extra information that stores some state change but this meant we could 
dump most of the database.


It was something that we hadn't appreciated a few years ago, and the 
ever increasing database meant we had to do something about the DB size.


Its taken a few months of work, mainly in a dark room with a wet towel 
on our foreheads, to do the tiny changes necessary to make a big 
difference.


In hindsight the fact that SQLite is so easy and fast to use was a 
slight disadvantage to us, it allowed us to be a bit quick and dirty 
with designs, when we should have thought through some of the issues. 
However thats what startups are like ;)


I think all the explanations to date have been helpful and appreciate 
the time take to answer, we're going to think a little more carefully 
about how we manage our database on a VM. I'm in the process of moving 
home so the only real hardware (of any note) is sitting in storage so 
the only testing we can do is either virtualised or on Mac laptops. 
Neither of which will help us in this instance.


Rob

On 17 Jul 2018, at 13:16, Simon Slavin wrote:


On 17 Jul 2018, at 1:10pm, R Smith  wrote:

What kind of data did you store (maybe some examples if possible) 
that you could condense it by ~99% like that?


I think that the OP's organisation discovered the 'relational' part of 
RDBMS and implemented normalisation.


To Rob Willett: Ryan Smith's explanation is better than mine.  Please 
read his post and ignore my wooly one.  I tried three ways to get 
across the sequential-access vs. random-access point and wasn't really 
happy with anything I wrote.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had to 
look it up using your reference.


No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then 
try again? My working assumption is that since there is a -wal file we 
are safe to do this.
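
For anyone following along, the check-and-disable steps are, as a sketch:

    PRAGMA secure_delete;        -- 1 means it is currently on
    PRAGMA secure_delete = OFF;  -- per-connection setting
    DROP TABLE Disruptions;      -- the big table discussed in this thread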


Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have

recorded the information in a far better format thats approx 99% more
efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table

would be quite quick. It's not. So far we've been waiting for 30 mins
and nothing has come back yet. We can see that the -wal file is upto
2.5GB. We have this terrible feeling that it'll need to get to 49GB 
or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had assumed 
that

it would be a really quick and easy operation based on absolutely no
checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to dropping
tables. This probably is because when SQLite drops a table, it has to 
go
through and erase the records in the database file that deal with 
that

table. MySQL and PostgreSQL, on the other hand, use separate files to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation so 
if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we assume
that the table is 49GB then will we need to wait until the -wal file 
is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Simon,

Thanks for this. You make some interesting points about cache hits and 
misses. Up until now, we hadn't seen a massive performance hit, whilst 
some database operations didn't work as fast as we would like them to, 
nothing works as fast as we would like.


I'm not familiar with these issues with virtualisation. The VPS we use 
has OpenVZ at its core (no pun intended). We can see a little on the 
internet about this (well one article specifically about MySQL and 
OpenVZ) but nothing else.


We are in the process of evaluating whether to move to a different VPS, 
some use OpenVZ, some use KVM, internally we use VMWare ESXI. Very 
little real DB work is done on the actual metal, its all virtualised.


Do you have any pointers to stuff we can read up on? We don't understand 
your comment "SQLite spent that whole time accessing your 50GB database 
file in an apparently random order." and would like to try and get more 
information about it.


We have the option of moving off OpenVZ to KVM or ESXI so if we can 
understand the issue, we can make a more informed choice. Whilst our DB 
has dropped down to 500MB we still need to do a fair amount of testing 
and checking to make sure there are no unusual edge cases (or bugs) 
based before we promote it to live.


Many thanks

Rob

On 17 Jul 2018, at 12:05, Simon Slavin wrote:

On 17 Jul 2018, at 8:37am, Rob Willett  
wrote:


I suspect that part of the issue is the VPS provider we use has a 
rate limiter on IOPS which is not normally an issue for us, but that 
might have slowed it down somewhat. However I don't think that it 
would have slowed it down by hours.


Actually I think VPS had a lot to do with the time the operation took. 
 Any kind of virtual machine takes a terrible hit during the sort of 
storage access involved in dropping the table.


SQLite spent that whole time accessing your 50GB database file in an 
apparently random order.   So you had nine hours of cache misses, 
causing the virtual machine to continually write virtual pages back to 
real storage and read other pages into memory.  Virtual systems are 
optimized for cache hits, not cache misses.


I can't prove it without a lot of pointless data manipulation on your 
type of VPS, but I think you found its least optimal operation.  The 
good part is that now your database is less than 1GB long you're going 
to see a massive increase in speed since the whole database may well 
fit in the cache of your virtual machine.


Must remember in future, when people report unusually slow operations, 
to ask whether they're using a virtual machine or real hardware.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett
The table drop finally finished around 07:00 this morning. It took 9.5 
hours to drop the circa 190M rows.


I suspect that part of the issue is the VPS provider we use has a rate 
limiter on IOPS which is not normally an issue for us, but that might 
have slowed it down somewhat. However I don't think that it would have 
slowed it down by hours.


Lessons to be learnt here:

1. Design your database correctly at the start. This was our biggest 
problem, we failed to understand how large a single table would grow and 
didn't manage the administration correctly and in good time.
2. Think carefully about the data you actually need. We managed to throw 
away 99% of our data and combined it together to get a 50GB database 
down to 500MB.
3. Dropping a table may not be the fastest method, creating all the 
other smaller tables in a separate database would have been far quicker. 
There's always more than one way to solve a problem.

4. Turn off secure_delete if it's on.
5. Use the mailing group for advice. Its brilliant!
6. Assess, plan, act. That's a diving phrase but very pertinent here, 
assess what the problem actually is before you plan and execute.


Now we have other checks to do to assure us the database is accurate for 
our needs.


Thanks to all the people who offered advice and help

Rob

On 17 Jul 2018, at 7:02, Rob Willett wrote:


Richard,

Thanks for the comprehensive update.

We appreciate that there are tradeoffs and that dropping a table is a 
rarer operation than most others. The upside of the Sqlite design is 
that we treat a database as a single file which we copy around with 
ease. You cannot do that with many other databases; the downside, though, 
comes when you need to drop a table.


A less charitable person would have stated that it was our own fault 
for designing the database and using it with approx 190M records in a 
single table with little thought about how we would manage it.  We now 
know this was a dumb idea, however you live and learn.


The impact of this is simply time. We've worked out how to shrink the 
database from 50GB to approx 1GB, after this we can prune the database 
on a weekly basis so this never becomes an issue.


I suppose the key question is: would we give up the portability of 
sqlite for a fast table drop? The answer is No.


Rob

On 16 Jul 2018, at 22:59, Richard Hipp wrote:


On 7/16/18, Rob Willett  wrote:


It does look as if one of sqlite's weaknesses is dropping very, very
large tables.



Right.  If every table were stored in a separate file, a DROP TABLE
could be translated to a relatively fast unlink().  But then a
database would be a directory full of files, rather than a single
file, which would undermine a lot of the usefulness of SQLite.
Furthermore, DROP TABLE is an uncommon operation.  We prefer to
provide a database where all content is contained in a single file and
that is optimized for SELECTs and for INSERTs and DELETEs of a subset of
the rows in the table, as that seems to be far more useful in most
cases.  There are always engineering trade-offs.  SQLite provides
single-file databases and fast queries in exchange for slower DROP
TABLEs.

A filesystem is able to implement unlink() quickly because it has the
entire partition available for laying out the locations of files.
Space can be allocated to a file in large chunks, which makes
deallocation faster.  In other words, in a filesystem, the files can
be positioned sparsely on disk, with lots of unused space in between
the various files to accommodate growth.  But in SQLite, the goal is to
keep the database file size as small as possible, and to not have
unnecessary unused pages in the middle of the database.  Hence, space
for tables must be allocated in relatively small 1-page chunks, which
means that there will be a large number of chunks to deallocate when
dropping a large table.  If you were to construct a filesystem that
tried to keep all file content tightly packed at the beginning of a
partition (say, for example, to make shrinking of a partition faster)
then unlink() would necessarily be slower on that filesystem.  That
seems like a bad engineering trade-off for a filesystem, but it is
(for most applications) a good trade-off for a database such as
SQLite.

You can work around this.  If you have one or more tables that you
think might need to be DROP-ed frequently, then consider storing them
in a separate database files and ATTACH-ing those separate database
files to your main database.  Then, to drop those tables, you can
DETACH them, and then unlink() the corresponding database file.  That
should go much faster.
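
A rough sketch of that workaround (file and table names made up):

    ATTACH DATABASE 'archive_part.db' AS arch;
    CREATE TABLE arch.BigLog (id INTEGER PRIMARY KEY, payload TEXT);
    -- ...use arch.BigLog as normal...
    DETACH DATABASE arch;
    -- then delete archive_part.db at the filesystem level instead of DROP TABLE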

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin

Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-17 Thread Rob Willett

Richard,

Thanks for the comprehensive update.

We appreciate that there are tradeoffs and that dropping a table is a 
rarer operation than most others. The upside of the Sqlite design is 
that we treat a database as a single file which we copy around with 
ease. You cannot do that with many other databases; the downside, though, 
comes when you need to drop a table.


A less charitable person would have stated that it was our own fault for 
designing the database and using it with approx 190M records in a single 
table with little thought about how we would manage it.  We now know 
this was a dumb idea, however you live and learn.


The impact of this is simply time. We've worked out how to shrink the 
database from 50GB to approx 1GB, after this we can prune the database 
on a weekly basis so this never becomes an issue.


I suppose the key question is: would we give up the portability of 
sqlite for a fast table drop? The answer is No.


Rob

On 16 Jul 2018, at 22:59, Richard Hipp wrote:


On 7/16/18, Rob Willett  wrote:


It does look as if one of sqlite's weaknesses is dropping very, very
large tables.



Right.  If every table were stored in a separate file, a DROP TABLE
could be translated to a relatively fast unlink().  But then a
database would be a directory full of files, rather than a single
file, which would undermine a lot of the usefulness of SQLite.
Furthermore, DROP TABLE is an uncommon operation.  We prefer to
provide a database where all content is contained in a single file and
that is optimized for SELECTs and for INSERTs and DELETEs of a subset of
the rows in the table, as that seems to be far more useful in most
cases.  There are always engineering trade-offs.  SQLite provides
single-file databases and fast queries in exchange for slower DROP
TABLEs.

A filesystem is able to implement unlink() quickly because it has the
entire partition available for laying out the locations of files.
Space can be allocated to a file in large chunks, which makes
deallocation faster.  In other words, in a filesystem, the files can
be positioned sparsely on disk, with lots of unused space in between
the various files to accommodate growth.  But in SQLite, the goal is to
keep the database file size as small as possible, and to not have
unnecessary unused pages in the middle of the database.  Hence, space
for tables must be allocated in relatively small 1-page chunks, which
means that there will be a large number of chunks to deallocate when
dropping a large table.  If you were to construct a filesystem that
tried to keep all file content tightly packed at the beginning of a
partition (say, for example, to make shrinking of a partition faster)
then unlink() would necessarily be slower on that filesystem.  That
seems like a bad engineering trade-off for a filesystem, but it is
(for most applications) a good trade-off for a database such as
SQLite.

You can work around this.  If you have one or more tables that you
think might need to be DROP-ed frequently, then consider storing them
in separate database files and ATTACH-ing those separate database
files to your main database.  Then, to drop those tables, you can
DETACH them, and then unlink() the corresponding database file.  That
should go much faster.

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Jay,

I think your approach would have been quicker. Ten hours so far and it's 
still deleting the table :(


Rob

On 17 Jul 2018, at 2:16, Jay Kreibich wrote:


On Jul 16, 2018, at 3:01 PM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have recorded the information in a far better format that's approx 99% 
more efficient. If only we had been this clever when we started….



Depending on the complexity of the scheme, it sounds like it might be 
easier to just dump the other tables, re-create the database without 
this jumbo table, and re-import the other data.
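
One way to sketch that in plain SQL (table and file names invented; note
that CREATE TABLE ... AS SELECT copies only the data, so indexes,
constraints and triggers have to be re-created by hand):

  -- build a slimmed-down copy alongside the original, then swap the files over
  ATTACH DATABASE 'slim.sqlite' AS slim;
  CREATE TABLE slim.small_table_1 AS SELECT * FROM small_table_1;
  CREATE TABLE slim.small_table_2 AS SELECT * FROM small_table_2;
  DETACH DATABASE slim;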


 -j


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett
Mmm. It's still taking an awfully long time, though the -wal file is 
very small.


It does look as if one of sqlite's weaknesses is dropping very, very 
large tables.


Oh well, lets let it run through the night.

Rob

On 16 Jul 2018, at 21:25, Rob Willett wrote:


Dan,

We've killed the process (kill -9). Fired up sqlite3 again, closed it 
down normally. The -wal files were removed.


Fired up sqlite3 again, turned off secure_delete, started to drop the 
table again, reniced it down (long story to do with IOPS and our VPS 
provider)


-wal file is empty, the -shm file is a steady 32768 bytes.

No idea if anything is actually happening now, but at least we don't 
have a massive -wal file.


Suspect this could take the night (it's 21:24 in London), so we'll 
detach the screen session and come back later.


Many thanks for the help.

Rob

On 16 Jul 2018, at 21:17, Dan Kennedy wrote:


On 07/17/2018 03:12 AM, Rob Willett wrote:

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had 
to

look it up using your reference.

No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and 
then
try again? My working assumption is that since there is a -wal file 
we

are safe to do this.


That sounds fine. Without secure-delete, the wal file should be 
pretty small.


You can always safely kill a process in the middle of an SQLite 
transaction. So long as you don't do anything foolhardy like deleting 
wal or journal files afterwards.


Dan.








Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as 
we have
recorded the information in a far better format thats approx 99% 
more

efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table
would be quite quick. It's not. So far we've been waiting for 30 
mins
and nothing has come back yet. We can see that the -wal file is 
upto
2.5GB. We have this terrible feeling that it'll need to get to 
49GB or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had 
assumed that
it would be a really quick and easy operation based on absolutely 
no

checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to 
dropping
tables. This probably is because when SQLite drops a table, it has 
to go
through and erase the records in the database file that deal with 
that
table. MySQL and PostgreSQL, on the other hand, use separate files 
to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation 
so if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we 
assume
that the table is 49GB then will we need to wait until the -wal 
file is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Dan,

We've killed the process (kill -9). Fired up sqlite3 again, closed it 
down normally. The -wal files were removed.


Fired up sqlite3 again, turned off secure_delete, started to drop the 
table again, reniced it down (long story to do with IOPS and our VPS 
provider)


-wal file is empty, the -shm file is a steady 32768 bytes.

No idea if anything is actually happening now, but at least we don't 
have a massive -wal file.


Suspect this could take the night (it's 21:24 in London), so we'll detach 
the screen session and come back later.


Many thanks for the help.

Rob

On 16 Jul 2018, at 21:17, Dan Kennedy wrote:


On 07/17/2018 03:12 AM, Rob Willett wrote:

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had 
to

look it up using your reference.

No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then
try again? My working assumption is that since there is a -wal file 
we

are safe to do this.


That sounds fine. Without secure-delete, the wal file should be pretty 
small.


You can always safely kill a process in the middle of an SQLite 
transaction. So long as you don't do anything foolhardy like deleting 
wal or journal files afterwards.


Dan.








Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have
recorded the information in a far better format thats approx 99% 
more

efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table
would be quite quick. It's not. So far we've been waiting for 30 
mins
and nothing has come back yet. We can see that the -wal file is 
upto
2.5GB. We have this terrible feeling that it'll need to get to 49GB 
or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had assumed 
that
it would be a really quick and easy operation based on absolutely 
no

checking whatsoever. When we looked on line all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to 
dropping
tables. This probably is because when SQLite drops a table, it has 
to go
through and erase the records in the database file that deal with 
that
table. MySQL and PostgreSQL, on the other hand, use separate files 
to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation 
so if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we 
assume
that the table is 49GB then will we need to wait until the -wal 
file is

at 49GB. By our estimates thats approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Any ideas on how long it takes to drop a large table...rather an important question now...

2018-07-16 Thread Rob Willett

Dan,

Thanks for the reply.

pragma secure_delete;
1
sqlite>

Which is a bit of a surprise as we have never seen it before. We had to 
look it up using your reference.


No idea why that is set (if 1 means it is).

Should we simply ctrl-C the deletion, turn secure_delete off and then 
try again? My working assumption is that since there is a -wal file we 
are safe to do this.
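
For the record, the check-and-disable sequence we are talking about
boils down to something like this (a sketch; the table name is made up):

  PRAGMA secure_delete;          -- returns 1 when deleted content is overwritten with zeros
  PRAGMA secure_delete = OFF;    -- per-connection setting
  DROP TABLE some_huge_table;    -- freed pages no longer get zeroed, so far less -wal traffic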


Rob

On 16 Jul 2018, at 21:07, Dan Kennedy wrote:


On 07/17/2018 03:01 AM, rob.sql...@robertwillett.com wrote:

Hi,

We're doing a massive tidy on our database which is approx 50GB.

One table is approx 49GB of that 50GB which we need to delete as we 
have

recorded the information in a far better format that's approx 99% more
efficient. If only we had been this clever when we started

We've just 'dropped' the table and were assuming that dropping the 
table

would be quite quick. It's not. So far we've been waiting for 30 mins
and nothing has come back yet. We can see that the -wal file is up to
2.5GB. We have this terrible feeling that it'll need to get to 49GB 
or
so before the table gets dropped. We can just about handle that in 
the

current filesystem.

We're now getting nervous about dropping this table. We had assumed 
that

it would be a really quick and easy operation based on absolutely no
checking whatsoever. When we looked online all we could see was a
reference to a very, very old and outdated page
(https://sqlite.org/speed.html) which talks about speed and at the
bottom of that page the comments

"SQLite is slower than the other databases when it comes to dropping
tables. This probably is because when SQLite drops a table, it has to 
go
through and erase the records in the database file that deal with 
that

table. MySQL and PostgreSQL, on the other hand, use separate files to
represent each table so they can drop a table simply by deleting a 
file,

which is much faster.

On the other hand, dropping tables is not a very common operation so 
if

SQLite takes a little longer, that is not seen as a big problem."

Is this still the case, is it going to take a long time? If we assume
that the table is 49GB then will we need to wait until the -wal file 
is

at 49GB. By our estimates that's approximately 10 hours away.

Any help or idea or suggestions welcomed, but please be quick.


Is secure-delete turned on?

  https://www.sqlite.org/pragma.html#pragma_secure_delete

Dan.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I'm getting a compiler error on a SQLIte build on Ubuntu 16.04 (which surprises me)

2018-07-10 Thread Rob Willett

David,

Thanks for the reply.

We have the latest build-essential

root@preprod1:/jambuster/src/sqlite-src-324/tool# apt-get install 
build-essential

Reading package lists... Done
Building dependency tree
Reading state information... Done
build-essential is already the newest version (12.1ubuntu2).
build-essential set to manually installed.
The following package was automatically installed and is no longer 
required:

  libllvm5.0
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Indeed we have the same gcc version

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609

It works without -O2; do you have optimisations left on?

Rob

On 10 Jul 2018, at 2:18, David Burgess wrote:


I'm using gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609

on 64 bit.

I have built both 64 and 32 bit versions from sqlite-src-324
No problems.

Check your build-essential install?

On Tue, Jul 10, 2018 at 1:05 AM, Dan Kennedy  
wrote:

On 07/09/2018 09:26 PM, Rob Willett wrote:


Dan,

Thanks for the information. We did try that first, failed and just 
assumed

we were idiots and went to the other download.

Just downloaded it again, so we can check and be on the safe side, 
and we

get the same issue.

cp /jambuster/src/sqlite-src-324/ext/session/sqlite3session.h .
./libtool --mode=compile --tag=CC gcc -g -O2 -DSQLITE_OS_UNIX=1 -I.
-I/jambuster/src/sqlite-src-324/src
-I/jambuster/src/sqlite-src-324/ext/rtree
-I/jambuster/src/sqlite-src-324/ext/icu
-I/jambuster/src/sqlite-src-324/ext/fts3
-I/jambuster/src/sqlite-src-324/ext/async
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H

-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1
-DSQLITE_HAVE_ZLIB=1  -DSQLITE_TEMP_STORE=1 -c sqlite3.c
libtool: compile:  gcc -g -O2 -DSQLITE_OS_UNIX=1 -I.
-I/jambuster/src/sqlite-src-324/src
-I/jambuster/src/sqlite-src-324/ext/rtree
-I/jambuster/src/sqlite-src-324/ext/icu
-I/jambuster/src/sqlite-src-324/ext/fts3
-I/jambuster/src/sqlite-src-324/ext/async
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H

-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1
-DSQLITE_HAVE_ZLIB=1 -DSQLITE_TEMP_STORE=1 -c sqlite3.c  -fPIC -DPIC 
-o

.libs/sqlite3.o
gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:715: recipe for target 'sqlite3.lo' failed

./configure seems OK, tcl looks OK (which was our worry), so we think
there is a genuine bug in the compiler. We still need the
sqlite3_analyzer program as well :) It's one of those days: I have bad
compilers and the UK government could fall.




After running [./configure], open the generated Makefile and remove 
all the

"-O2" switches you can find. Then run [make sqlite3_analyzer].

You'll get an un-optimized binary of course, but there's a good 
chance this
will bypass the compiler bug. And it won't make much difference 
unless your

databases are really big.

Dan.









Rob

On 9 Jul 2018, at 15:02, Dan Kennedy wrote:


On 07/09/2018 03:56 PM, rob.sql...@robertwillett.com wrote:


Hi

I'm trying to build the SQLite packages from the source. I 
**only** want
it to get a copy of sqlite_analyse which for some reason doesn't 
appear to
be easily available as a compiled option. The download package 
doesn't work
on Ubuntu 16.04 and as far as we can see, there are no other 
downloadable
binaries that don't come from China. For some reason, I don't 
really want to

download something from Beijing



The Linux binaries on sqlite.org are for 32-bit systems. For them 
to work

on Ubuntu 16.04, you probably need to install multiarch support.

Or, to build sqlite3_analyzer from source you need the full source
distribution:

  https://www.sqlite.org/2018/sqlite-src-324.zip

Dan.





The full steps I followed are at the very bottom of this e-mail. 
They

are a direct copy from my terminal output.

**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS 
Server
running under OpenVZ on a commercial ISP. It's patched to the 
current
levels. I don't think this is a problem as we don't have any other 
issues

but...

2. SQLite3 is already installed as the normal (and older) SQLIte 
that is

distributed with Ubuntu 16.04 LTS

3. gcc is installed. I have created a small C program to test that 
it

can compile (Hello World) and it's fine. (gcc (Ubuntu
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)

4. Downloaded 
https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz

through wget.

5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into it's own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details 
are

below.

9. Ran make and get a compiler failure!  (see below). I'm 
gobsmacked
that the compiler has failed

Re: [sqlite] I'm getting a compiler error on a SQLIte build on Ubuntu 16.04 (which surprises me)

2018-07-10 Thread Rob Willett

Dan,

Many thanks for this. I removed two instances of -O2 from the Makefile 
and, lo and behold, it compiles.


Sadly my database to check is approx 80GB which could be interesting. 
However I'll take a working (but slower) program over a faster (but not 
working) application every time :)


Rob

On 9 Jul 2018, at 16:05, Dan Kennedy wrote:


On 07/09/2018 09:26 PM, Rob Willett wrote:

Dan,

Thanks for the information. We did try that first, failed and just 
assumed we were idiots and went to the other download.


Just downloaded it again, so we can check and be on the safe side, 
and we get the same issue.


cp /jambuster/src/sqlite-src-324/ext/session/sqlite3session.h .
./libtool --mode=compile --tag=CC gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H -DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 
-DSQLITE_THREADSAFE=1 -DSQLITE_HAVE_ZLIB=1  -DSQLITE_TEMP_STORE=1 -c 
sqlite3.c
libtool: compile:  gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session 
-D_HAVE_SQLITE_CONFIG_H -DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 
-DSQLITE_THREADSAFE=1 -DSQLITE_HAVE_ZLIB=1 -DSQLITE_TEMP_STORE=1 -c 
sqlite3.c  -fPIC -DPIC -o .libs/sqlite3.o

gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:715: recipe for target 'sqlite3.lo' failed

./configure seems OK, tcl looks OK (which was our worry), so we think 
there is a genuine bug in the compiler. We still need the 
sqlite3_analyzer program as well :) It's one of those days: I have bad 
compilers and the UK government could fall.



After running [./configure], open the generated Makefile and remove 
all the "-O2" switches you can find. Then run [make sqlite3_analyzer].


You'll get an un-optimized binary of course, but there's a good chance 
this will bypass the compiler bug. And it won't make much difference 
unless your databases are really big.


Dan.








Rob

On 9 Jul 2018, at 15:02, Dan Kennedy wrote:


On 07/09/2018 03:56 PM, rob.sql...@robertwillett.com wrote:

Hi

I'm trying to build the SQLite packages from the source. I **only** 
want it to get a copy of sqlite_analyse which for some reason 
doesn't appear to be easily available as a compiled option. The 
download package doesn't work on Ubuntu 16.04 and as far as we can 
see, there are no other downloadable binaries that don't come from 
China. For some reason, I don't really want to download something 
from Beijing


The Linux binaries on sqlite.org are for 32-bit systems. For them to 
work on Ubuntu 16.04, you probably need to install multiarch 
support.


Or, to build sqlite3_analyzer from source you need the full source 
distribution:


  https://www.sqlite.org/2018/sqlite-src-324.zip

Dan.





The full steps I followed are at the very bottom of this e-mail. 
They are a direct copy from my terminal output.


**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS Server 
running under OpenVZ on a commercial ISP. It's patched to the 
current levels. I don't think this is a problem as we don't have 
any other issues but...


2. SQLite3 is already installed as the normal (and older) SQLIte 
that is distributed with Ubuntu 16.04 LTS


3. gcc is installed. I have created a small C program to test that 
it can compile (Hello World) and it's fine. (gcc (Ubuntu 
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)


4. Downloaded 
https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz through 
wget.


5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into it's own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details 
are below.


9. Ran make and get a compiler failure!  (see below). I'm 
gobsmacked that the compiler has failed to be honest. So my first 
assumption is that we have cocked up something. However we've 
googled and checked and can't see anybody with a similar issue. I 
suspect that one of the Define statements is not playing nicely but 
not wholly sure which one OR we are missing a package that needs to 
be installed, e.g. TCL x,y or z.


root@preprod1:/jambuster/src/sqlite/sqlite-autoconf-324# make
/bin/bash ./libtool  --tag=CC   --mode=compile gcc 
-DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\" 
-DPACKAGE_VERSION=\"3.24.0\" -DPACKAGE_STRING=\"sqlite\ 3.24.0\" 
-DPACK

Re: [sqlite] I'm getting a compiler error on a SQLIte build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread Rob Willett

Dan,

Thanks for the information. We did try that first, failed and just 
assumed we were idiots and went to the other download.


Just downloaded it again, so we can check and be on the safe side, and 
we get the same issue.


cp /jambuster/src/sqlite-src-324/ext/session/sqlite3session.h .
./libtool --mode=compile --tag=CC gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session -D_HAVE_SQLITE_CONFIG_H 
-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1   
-DSQLITE_HAVE_ZLIB=1  -DSQLITE_TEMP_STORE=1 -c sqlite3.c
libtool: compile:  gcc -g -O2 -DSQLITE_OS_UNIX=1 -I. 
-I/jambuster/src/sqlite-src-324/src 
-I/jambuster/src/sqlite-src-324/ext/rtree 
-I/jambuster/src/sqlite-src-324/ext/icu 
-I/jambuster/src/sqlite-src-324/ext/fts3 
-I/jambuster/src/sqlite-src-324/ext/async 
-I/jambuster/src/sqlite-src-324/ext/session -D_HAVE_SQLITE_CONFIG_H 
-DBUILD_sqlite -DNDEBUG -I/usr/include/tcl8.6 -DSQLITE_THREADSAFE=1 
-DSQLITE_HAVE_ZLIB=1 -DSQLITE_TEMP_STORE=1 -c sqlite3.c  -fPIC -DPIC -o 
.libs/sqlite3.o

gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
Makefile:715: recipe for target 'sqlite3.lo' failed

./configure seems OK, tcl looks OK (which was our worry), so we think there 
is a genuine bug in the compiler. We still need the sqlite3_analyzer 
program as well :) It's one of those days: I have bad compilers and the 
UK government could fall.


Rob

On 9 Jul 2018, at 15:02, Dan Kennedy wrote:


On 07/09/2018 03:56 PM, rob.sql...@robertwillett.com wrote:

Hi

I'm trying to build the SQLite packages from the source. I **only** 
want it to get a copy of sqlite_analyse which for some reason doesn't 
appear to be easily available as a compiled option. The download 
package doesn't work on Ubuntu 16.04 and as far as we can see, there 
are no other downloadable binaries that don't come from China. For 
some reason, I don't really want to download something from 
Beijing


The Linux binaries on sqlite.org are for 32-bit systems. For them to 
work on Ubuntu 16.04, you probably need to install multiarch support.


Or, to build sqlite3_analyzer from source you need the full source 
distribution:


  https://www.sqlite.org/2018/sqlite-src-324.zip

Dan.





The full steps I followed are at the very bottom of this e-mail. They 
are a direct copy from my terminal output.


**TLDR Summary**

1. OS is a reasonably fresh (< 4 weeks old) Ubuntu 16.04 LTS Server 
running under OpenVZ on a commercial ISP. It's patched to the current 
levels. I don't think this is a problem as we don't have any other 
issues but...


2. SQLite3 is already installed as the normal (and older) SQLIte that 
is distributed with Ubuntu 16.04 LTS


3. gcc is installed. I have created a small C program to test that it 
can compile (Hello World) and it's fine. (gcc (Ubuntu 
5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609)


4. Downloaded 
https://www.sqlite.org/2018/sqlite-autoconf-324.tar.gz through 
wget.


5. Checked the sha1sum and it matches the sqlite page.

6. Untarred the file into it's own directory

7. cd'ed to directory and ran ./configure.

8. Checked output of configure and nothing seemed wrong. Details are 
below.


9. Ran make and get a compiler failure!  (see below). I'm gobsmacked 
that the compiler has failed to be honest. So my first assumption is 
that we have cocked up something. However we've googled and checked 
and can't see anybody with a similar issue. I suspect that one of the 
Define statements is not playing nicely but not wholly sure which one 
OR we are missing a package that needs to be installed, e.g. TCL x,y 
or z.


root@preprod1:/jambuster/src/sqlite/sqlite-autoconf-324# make
/bin/bash ./libtool  --tag=CC   --mode=compile gcc 
-DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\" 
-DPACKAGE_VERSION=\"3.24.0\" -DPACKAGE_STRING=\"sqlite\ 3.24.0\" 
-DPACKAGE_BUGREPORT=\"http://www.sqlite.org\; -DPACKAGE_URL=\"\" 
-DPACKAGE=\"sqlite\" -DVERSION=\"3.24.0\" -DSTDC_HEADERS=1 
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 
-DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_FDATASYNC=1 
-DHAVE_USLEEP=1 -DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1 
-DHAVE_DECL_STRERROR_R=1 -DHAVE_STRERROR_R=1 -DHAVE_POSIX_FALLOCATE=1 
-DHAVE_ZLIB_H=1 -I.-D_REENTRANT=1 -DSQLITE_THREADSAFE=1  
-DSQLITE_ENABLE_FTS5 -DSQLITE_ENABLE_JSON1 -DSQLITE_HAVE_ZLIB  
-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_RTREE  -g -O2 -MT sqlite3.lo -MD 
-MP -MF .deps/sqlite3.Tpo -c -o sqlite3.lo sqlite3.c
libtool: compile:  gcc -DPACKAGE_NAME=\"sqlite\" 

Re: [sqlite] I'm getting a compiler error on a SQLIte build on Ubuntu 16.04 (which surprises me)

2018-07-09 Thread Rob Willett

Clemens,

Thanks. We were up to date anyway, but we did check and we still get the 
same error.


It does look like a compiler bug; also, nobody else seems to have 
reported it, which I think is very odd. I cannot be the first person to 
compile this version of SQLite on Ubuntu 16.04 server.


We'll work out how to report it. I still have the same problem of trying 
to get hold of a copy of sqlite3_analyzer though :( which is what I wanted 
to do in the first place.


Rob

On 9 Jul 2018, at 11:24, Clemens Ladisch wrote:


rob.sql...@robertwillett.com wrote:

gcc: internal compiler error: Killed (program cc1)


This is a compiler bug.  Check if updating gcc to a current version 
helps.



Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.


Otherwise, do this.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is this really the best way to do this?

2018-05-16 Thread Rob Willett

Clemens, Paul S, Abroży,

Thank you for taking the time to reply. I'll look at each of them and 
see what makes the most sense to me.


It's always good to ask the people here as there is a wealth of 
experience.


As suggestions popped in, I kept thinking of the old Unix competition to 
list all the ways to accomplish printing something out; there's always 
another way to do it. I will learn here as well.


Many thanks to everybody,

Rob

On 16 May 2018, at 11:25, Clemens Ladisch wrote:


Rob Willett wrote:

select CASE
WHEN EXISTS (select 1 from blocked where email = 
'rwillett.dr...@example.com')
THEN (select action from blocked where email = 
'rwillett.dr...@example.com')

ELSE 'OK'
END


SELECT action FROM blocked WHERE email = ?
UNION ALL
SELECT 'OK'
LIMIT 1;


Regards,
Clemens
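
For anyone comparing approaches, an equivalent way to phrase the same
lookup uses COALESCE (a sketch; :email is just a placeholder for the
address that Postfix substitutes via '%s'):

  SELECT COALESCE(
           (SELECT action FROM blocked WHERE email = :email),
           'OK') AS action;

Either way, the UNIQUE constraint on blocked.email means the lookup is a
single indexed probe.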
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Is this really the best way to do this?

2018-05-16 Thread Rob Willett

Hi,

I'm experimenting with an email server, Mail In a Box. It's a free 
product and gives you a mail server in a box with SSL certificates, 
multiple domains and seems to work well.


One thing it doesn't do is allow people to specify emails to block.

It uses SQLite to store the underlying data necessary to power Postfix. 
Dr Richard Hipp, add another new application to your list :)


I've worked out how Postfix calls SQLite (pretty easy), worked out what 
I want to do, and am trying to write a single line of SQL that does it.


The need is that the SQL must generate a single string answer depending 
on the email address that is fed in through the query. There are no 
options here: it must return a value even if there is nothing in the 
table. It is not practical to add a table with every email address that 
returns OK.


For this example the only strings it can return are 'OK' and 'DISCARD', 
though the RFC allows other strings. If a query is done on an email and 
it is blocked, then it must return DISCARD (or whatever action is in the 
action column). If no matching email is in the table then it must return 'OK'.


As a piece of pseudocode this would be:

function GetMailStatus(emailAddress)
{
    IF emailAddress is present in blocked THEN
        return action associated with emailAddress   -- action is normally DISCARD

    return 'OK'
}

I've created the table

CREATE TABLE blocked (id INTEGER PRIMARY KEY AUTOINCREMENT,
   email TEXT NOT NULL UNIQUE ,
  action TEXT NOT NULL DEFAULT 'DISCARD')

I can insert values

insert into blocked (email) values('rwillett.dr...@example.com')
insert into blocked (email) values('rwillett+dr...@example.com')

And this is the code that I have written that works but it looks poor to 
my untutored eyes


select CASE
    WHEN EXISTS (select 1 from blocked where email = 'rwillett.dr...@example.com')
    THEN (select action from blocked where email = 'rwillett.dr...@example.com')
    ELSE 'OK'
END


In the Postfix query I'll replace

email = 'rwillett.dr...@example.com'

with email = '%s'

so that the email address is passed in.

My worry is that the query has two selects and 'feels' bad. It works but 
feels dirty...


Is there a better way under SQLite?

Any suggestions welcomed.

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] About storage of large amounts of image data

2018-05-08 Thread Rob Willett

Mike,

We use SQLite to store BLOB data to do with polygons and R-trees for 
mapping traffic data. We're running 70-90GB (we've not looked recently) and 
the size has never been an issue for us (which is why we've not looked).


However, binary images might start to get towards the field size limit, 
which defaults to 10^9 bytes (from http://www.sqlite.org/limits.html). Now 
1GB might be considered large at the moment, but as image files, 
especially RAW images, creep up due to sensor size, this may be too 
small in a few years' time. I know that some people are playing with 
multi-gigabyte images from very, very hi-res cameras and stitching stuff 
together, which would get close to this. The limit is a compile-time 
option so you can raise it, although the hard ceiling for a single field 
is 2^31-1 bytes (just under 2GB), so unless you are into serious 
astronomy that *might* be enough for a while.
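
If anyone does go down that road, a minimal schema sketch (table names
invented) that keeps the raw bytes out of the metadata table, so routine
scans never drag the blobs through the page cache:

  CREATE TABLE image_meta (
      id       INTEGER PRIMARY KEY,
      filename TEXT NOT NULL,
      taken_at TEXT
  );

  CREATE TABLE image_data (
      id   INTEGER PRIMARY KEY REFERENCES image_meta(id),
      data BLOB NOT NULL        -- the raw image bytes
  );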


I'd be surprised if the images you have are going to seriously stress 
SQLite to be honest. Now if you are looking for a Lightroom replacement, 
that would be an interesting project and I'd love to hear more :)


SQLite advertises itself as a file system replacement (for some uses). 
This could well be one of the use cases.


The thing I would consider is how to back up your multi-terabyte image 
database. That might take some time...


All the best,

Rob


On 8 May 2018, at 14:08, Mike Clark wrote:


 Hi List!

I'm developing a project that deals with image files and am 
considering
storing all the images in the SQLite database itself, rather than (or 
in
addition to) the file system.  Since the prospective users will 
probably be

dealing with hundreds of gigabytes in their use of the project, I am
wondering if this is an effective or efficient use of SQLite -- or 
safe,

because of the risk of data corruption.

I know the documentation says that SQLite can handle up to 140 TB (do 
we
know of anyone who is doing this?), so hundreds of gigs is clearly 
doable.


Is it advisable, however?

(Sent this about 2 wks ago, but it never posted to the list and no
moderator response either).

--
Mike Clark
Twitter: @Cyberherbalist
Blog: Cyberherbalist's Blog 
-
"Free will, though it makes evil possible, is also the only thing that
makes possible any love or goodness or joy worth having."
*- C. S. Lewis*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How many AUTOINCREMENT tables are in your schema?

2018-03-17 Thread Rob Willett

We have 17 in one database and 1 in another.

Rob

On 17 Mar 2018, at 11:43, x wrote:


0 in my first sqlite3 DB




From: sqlite-users  on 
behalf of Jean-Christophe Deschamps 

Sent: Saturday, March 17, 2018 9:04:22 AM
To: SQLite mailing list
Subject: Re: [sqlite] How many AUTOINCREMENT tables are in your 
schema?




How many tables in your schema(s) use AUTOINCREMENT?


Maybe a dozen uses in DB designs I made for my own use. Zero or very
few in designs made for others.

My use case of autoincrement is certainly somehow peculiar.

For some of DBs I use daily for managing my own business (I'm
self-employed) I didn't feel the need to write ad-hoc applications and
I only use a third-party SQLite manager (SQlite Expert is open 24/7).
Of course such use demands real care and a lot of fancy constraints,
triggers, add-on functions, external procedures, etc to be workable.
This isn't scalable nor usable by anyone else.

Several of my tables are best viewed/edited as LIFOs: the more recent
entries on top. For instance a table of orders, where you prefer 
recent

entries to be on top of the table when viewed by "natural" (ID) order.
To achieve that effect I use autoincrement and triggers which negate
the rowid alias at insert. The sqlite-sequence entry is handy to make
new row get an unused ID which, once negated, will show first when
viewing the table, albeit there is no more any non-negative ID in the
actual table.

I wouldn't have the use of autoincrement if my DB manager had a
settable ORDER BY clause for basic table viewing/editing.
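
A rough sketch of that negate-the-rowid trick (purely illustrative; the
table and trigger names are invented):

  CREATE TABLE orders (
      id   INTEGER PRIMARY KEY AUTOINCREMENT,
      note TEXT
  );

  -- sqlite_sequence keeps handing out increasing positive ids; flipping the
  -- sign right after insert makes the newest row sort first under ORDER BY id.
  CREATE TRIGGER orders_lifo AFTER INSERT ON orders
  BEGIN
      UPDATE orders SET id = -NEW.id WHERE id = NEW.id;
  END;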


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Move to Github!!?

2017-12-27 Thread Rob Willett

Dear all,

I've enjoyed reading about the Github vs Fossil discussion. There's been 
some passionate and provocative statements :)


What it's done for me is make me think again about how we manage our 
source code control, which is always good. We very recently moved from 
our existing text-based RCS system to a privately hosted Git system. I 
know RCS is possibly older than the average reader on this mailing list, 
but it works very well for us and the workflow 'just works'. Until 
recently we hadn't needed anything more, so why change?


RCS started showing its age as more and more people worked out of the 
office, often with no internet access, and we became more distributed. 
So after an extensive review of the various SCM systems out there, which 
lasted almost a whole hour, we decided to try out Git on a private 
server. Private servers are pretty cheap and we didn't want to pay 
Github. We were also acutely aware that business models change on a whim 
and we didn't want to be tied into Github financially (small though it 
might be) or technically.


We have spent a month looking at how we use Git and we don't feel 100% 
happy with it; mind you, 75M other people seem to think it's OK, but this 
mailing list started us thinking about Fossil. This mailing list has 
always been a source of accurate (and sometimes passionate) information. 
It has rarely (never?) had much 'fluff' in it. (Fluff is an English 
phrase meaning ephemeral or lightweight.) Whilst I may not agree with 
everything or even understand most of the SQL, the quality is very, very 
high, and when people state something it's normally worth listening to. 
We tend to lurk rather than contribute as other people know SQLite far, 
far better than we do. If you have any Perl questions, please ask :)


It's pretty easy to set up Git on a VPS; it took an hour or so before we 
had it going, and it started well. What we then found was that the 
documentation and the commands were exceptionally confusing. It appeared 
to us that most of the documentation is self-referential, and we kept 
going round and round in circles trying to understand why something was 
written the way it was. Even though 75M people seem to use Git, it was a 
struggle to understand it. We accept that it might be us, but we felt 
like we were walking through treacle.


The next issue we found was that large binary files can be problematic. 
We have started making Twitter video ads to promote our apps. In the 
interest of fairness, and not to self-advertise, I won't promote them here 
:) However, we have 20GB of video and pictures we've created over the 
last few months. It's amazing how much space (and time) can be taken up 
with this sort of thing. We found that by itself Git doesn't really like 
binary files that much and that you tend to need an extension to handle 
large binary files. There also appear to be different extensions that 
handle this sort of stuff, and it is painful to work out which is the 
best system. It could also be that we simply haven't got the hang of Git.


So we started looking again at Fossil and liked what we saw. We have no 
emotional commitment to Git; we want an easy-to-use SCM that works with 
Emacs (I know, I know) and runs on Linux and Macs. Fossil looks like it 
might fit the bill. There's an Emacs mode, so that box is ticked; there's 
a command-line mode, so that's easy; and the command set seems to be 
orthogonal, which is important.


The one area we struggled with is how we might handle large binary files 
(video files). These are made in Final Cut Pro X (FCPX) (please don't 
get us started on the backup capability of FCPX), Screenflow (an 
excellent video editing system that doesn't try to do what FCPX does, 
but does 'simple' videos very well and is fantastically quick to work 
with), Affinity Designer and Affinity Photo. None of these really 
produces XML files, so the outputs are genuine binary files. It appears 
that Fossil can handle binary files 
(https://www.fossil-scm.org/index.html/doc/trunk/www/fileformat.wiki) 
but we struggled to work out how that really works. We'd be very 
interested in people's experiences here with Fossil and large binary 
files (300MB). It would be great to have a single repository where we 
can store text files and binary files and have them managed easily.


I am aware that this discussion might be better suited to the Fossil mailing 
list, but since there's so much information on here, we thought we'd ask 
this group first.


Thanks very much for reading,

Rob

On 26 Dec 2017, at 22:47, Jungle Boogie wrote:


On Tue 26 Dec 2017  3:08 PM, Damien Sykes wrote:

Hi,
This is a question I have asked myself many times (I.E. Git projects 
moving

to Fossil).
GitHub is well known and boasts over 74 million repositories, yet 
Fossil,
which both hosts and utilises one of the most well-known database 
systems in
the world, I doubt can count a thousand. Even the ChiselApp hosting 
platform
hosts a mere 360 public 

Re: [sqlite] Can't find the source to sqldiff - And yes I have looked :)

2017-09-19 Thread Rob Willett
Thanks,

We have found it under sqlite-src-3200100. We clearly missed it.

Apologies

Rob

On 19 Sep 2017, at 9:18, Rowan Worth wrote:

> I don't have sqlite-src-3200100.zip on hand but it seems to be there in
> tool/sqldiff.c for sqlite-src-3081002 (and in the fossil repo).
>
> -Rowan
>
> On 19 September 2017 at 15:58, Rob Willett <rob.sql...@robertwillett.com>
> wrote:
>
>> Hi,
>>
>> I'm trying to find the source to sqldiff and am struggling to locate it.
>>
>> I've looked on
>>
>> https://sqlite.org/download.html
>>
>> and downloaded
>>
>> sqlite-amalgamation-3200100.zip
>> sqlite-autoconf-3200100.tar.gz
>> sqlite-src-3200100.zip
>>
>> unzipped (or gunzipped) them all and still can't see sqldiff.
>>
>> I've also Googled for it and can see some tart replies saying its on the
>> sqlite download page. I can see that this page says its included in the
>> official distribution but I can't find it
>>
>> http://sqlite.1065341.n5.nabble.com/sqlitediff-td82394.html
>>
>> and can see from Richard Hipp that it *might* be here
>>
>> https://www.sqlite.org/src/artifact/0748c0daed08f31e
>>
>> I can't believe that the above link is the only way to get the sqldiff
>> program and is this the latest and greatest.
>>
>> Have I missed something important or a link somewhere? I've been looking
>> for the last hour and am struggling here to find something that should be
>> reasonably obvious.
>>
>> I need the source as I need to compile a 64bit version for a Linux box
>> that doesn't have the 32bit libraries on. Loading the 32 bit libraries is
>> not an option for us.
>>
>> Any help or suggestions welcomed.
>>
>> Thanks
>>
>> Rob
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Can't find the source to sqldiff - And yes I have looked :)

2017-09-19 Thread Rob Willett

Hi,

I'm trying to find the source to sqldiff and am struggling to locate it.

I've looked on

https://sqlite.org/download.html

and downloaded

sqlite-amalgamation-3200100.zip
sqlite-autoconf-3200100.tar.gz
sqlite-src-3200100.zip

unzipped (or gunzipped) them all and still can't see sqldiff.

I've also Googled for it and can see some tart replies saying it's on the 
sqlite download page. I can see that this page says it's included in the 
official distribution but I can't find it


http://sqlite.1065341.n5.nabble.com/sqlitediff-td82394.html

and can see from Richard Hipp that it *might* be here

https://www.sqlite.org/src/artifact/0748c0daed08f31e

I can't believe that the above link is the only way to get the sqldiff 
program, and I'm not sure whether it is the latest and greatest.


Have I missed something important or a link somewhere? I've been looking 
for the last hour and am struggling here to find something that should 
be reasonably obvious.


I need the source as I need to compile a 64bit version for a Linux box 
that doesn't have the 32bit libraries on. Loading the 32 bit libraries 
is not an option for us.


Any help or suggestions welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite Port/ODBC Driver Question

2017-07-10 Thread Rob Willett

It depends on what you mean by remotely.

By itself SQLite doesn't have any networking library built in. It's an 
embedded database.


You can put application wrappers around the database; I believe that 
wrappers exist to make SQLite into a true client/server system, but that's 
additional code. Also there is ODBC, but there's nothing (AFAIK) in the 
actual codebase itself that allows any remote connectivity.


Clearly you can put Apache/Nginx/PHP/SQLite into a software stack and 
make it work; we actually use Nginx/Mojolicious/SQLite as our platform 
stack, but there's nothing in there that allows any remote access to 
SQLite itself.


If you are talking about hosting the database on a network volume, I 
would recommend that you read this


https://sqlite.org/whentouse.html

The very first paragraph states what Sqlite can do. I would also pay 
close attention to
"If there are many client programs sending SQL to the same database over 
a network, then use a client/server database engine instead of SQLite. 
SQLite will work over a network filesystem, but because of the latency 
associated with most network filesystems, performance will not be great. 
Also, file locking logic is buggy in many network filesystem 
implementations (on both Unix and Windows). If file locking does not 
work correctly, two or more clients might try to modify the same part of 
the same database at the same time, resulting in corruption. Because 
this problem results from bugs in the underlying filesystem 
implementation, there is nothing SQLite can do to prevent it.


A good rule of thumb is to avoid using SQLite in situations where the 
same database will be accessed directly (without an intervening 
application server) and simultaneously from many computers over a 
network."


Just my 2p worth,

Rob

On 10 Jul 2017, at 14:14, Igor Korot wrote:


Rob,

On Mon, Jul 10, 2017 at 7:06 AM, Rob Willett
<rob.sql...@robertwillett.com> wrote:

Vishal,

SQLite isn't a traditional client/server relational database, 
therefore

there isn't a port to open up. It runs on a local machine.


I believe SQLite can successfully be run remotely.

Thank you.



Now there are wrappers around SQLite to extend it, I assume this ODBC 
driver

is one of them.

I suspect people here *may* know the answer regarding any ports the 
ODBC
driver uses, but you may be better off asking the maintainer of the 
ODBC

driver.

Rob


On 10 Jul 2017, at 1:31, Shukla, Vishal wrote:


Hi,
Am trying to open a firewall to the machine having sqlite database. 
Does
the SQLite database use a specific port number ? If not, then does 
the ODBC

connection to SQLite using ODBC driver use a port ?

Any help will be greatly appreciated.

SQLite ODBC Driver:
http://www.ch-werner.de/sqliteodbc/sqliteodbc.exe

Regards,
Vishal Shukla

Confidential communication
Westpac Banking Corporation (ABN 33 007 457 141)
Westpac Institutional Bank is a division of Westpac Banking 
Corporation

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite Port/ODBC Driver Question

2017-07-10 Thread Rob Willett

Vishal,

SQLite isn't a traditional client/server relational database, so there 
isn't a port to open up. It runs on the local machine.


Now there are wrappers around SQLite to extend it, I assume this ODBC 
driver is one of them.


I suspect people here *may* know the answer regarding any ports the ODBC 
driver uses, but you may be better off asking the maintainer of the ODBC 
driver.


Rob

On 10 Jul 2017, at 1:31, Shukla, Vishal wrote:


Hi,
Am trying to open a firewall to the machine having sqlite database. 
Does the SQLite database use a specific port number ? If not, then 
does the ODBC connection to SQLite using ODBC driver use a port ?


Any help will be greatly appreciated.

SQLite ODBC Driver:
http://www.ch-werner.de/sqliteodbc/sqliteodbc.exe

Regards,
Vishal Shukla

Confidential communication
Westpac Banking Corporation (ABN 33 007 457 141)
Westpac Institutional Bank is a division of Westpac Banking 
Corporation

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] -shm and -wal files left after trivially simple select statement from command line

2017-05-15 Thread Rob Willett

Richard, J Decker,

Just to close this down.

The command was

select * from device;

We used .quit and Ctrl-D to exit the shell

The command sequence was as simple as

sqlite3 DATABASE.sqlite

We knew about things like less and head and so we avoided them.

The version of SQLite seems to be the problem: the standard one with 
Ubuntu 14.04 LTS is 3.8.2, which we know is old but until now has seemed 
to be fine.


Upgrading our production servers to Ubuntu 16 LTS is a massive 
undertaking. We have nine of them plus test and preprod, so upgrading and 
testing everything again is many, many weeks' work :(


We've upgraded the SQLite CLI to the latest 3.18.0 and are trying it out 
in a nice safe environment.


Thanks

Rob

On 15 May 2017, at 12:38, Richard Hipp wrote:


On 5/15/17, Richard Hipp <d...@sqlite.org> wrote:

On 5/15/17, Rob Willett <rob.sql...@robertwillett.com> wrote:


As part of the testing we noticed that -shm and -wal files were 
being
left after we used sqlite3 on the command line. This puzzled us as 
we

didn't see any errors in our test scripts.


How is the command being ended.  Are you inserting a ".quit" command?
Are you somehow causing the shell to terminate prematurely, perhaps 
by
piping the output into a filter (such as "head") that closes the 
input

connection early?  What version of the SQLite shell is running?  What
is the query?


Never mind - it appears that JDecker already knew the problem and the
solution

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] -shm and -wal files left after trivially simple select statement from command line

2017-05-15 Thread Rob Willett
Thanks for this. This seems to be the old version of SQLite that comes 
with Ubuntu. We'll upgrade.


Just to be clear, we do close the connection: all we do is open the 
database using sqlite3 from the command line, do a select and close 
sqlite3 again.


We do not do anything that modifies the database.

We'll have to upgrade the version of SQLite, test and then upgrade our 
database.
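
In case it helps anyone else tidying up, a small sketch of how a newer
CLI can fold a leftover -wal file back into the main database (this
assumes no other process has the database open):

  PRAGMA journal_mode;               -- reports 'wal' if the database is in WAL mode
  PRAGMA wal_checkpoint(TRUNCATE);   -- copy the WAL back into the main file and truncate it

Closing the last connection cleanly also removes the -wal and -shm files.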


Thanks

Rob

On 15 May 2017, at 12:09, J Decker wrote:

That was an old issue... current sqlite versions do not have this 
issue.

You MUST close the connection though, for the files to get deleted.

http://sqlite.1065341.n5.nabble.com/journal-files-not-always-removed-td83700.html#a83705

On Mon, May 15, 2017 at 3:01 AM, Rob Willett 
<rob.sql...@robertwillett.com>

wrote:


Hi,

We've encountered what we think is an odd situation and we can't find 
any
explanation for why this is. We're also not sure if its a problem or 
not.


A brief summary is that we are doing a major database upgrade and are
doing (lots) of testing. The box is a Linux box running Ubuntu 14.04 
LTS
Server. This is a standard box we have used for a few years, we 
*might*

move to 16 LTS shortly.

As part of the testing we noticed that -shm and -wal files were being 
left
after we used sqlite3 on the command line. This puzzled us as we 
didn't see

any errors in our test scripts.

We then narrowed this problem down to doing a simple SELECT statement 
on a

table.

e.g. This is our starting state. Note that nothing is touching these
database as they are isolated in a directory, so there is no other 
process

playing with them.

root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66892
-rw-r--r-- 1 root root 34217984 May 15 10:38 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig


root@demonotifications:/jambuster/notifications/upgrade_test# cp
accountmanagement.sqlite.orig accountmanagement.sqlite

root@demonotifications:/jambuster/notifications/upgrade_test# sqlite3
accountmanagement.sqlite
-- Loading resources from /root/.sqliterc

SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from device;
UUID        isValid     Version     WhatAmI     Application
----------  ----------  ----------  ----------  ------------------
<           1           1.1         ios_app     JambusterForLondon
.. Hundreds more lines of device data which looks correct
<           1           1.1         ios_app     JambusterForLondon

sqlite> .exit
root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66924
-rw-r--r-- 1 root root32768 May 15 10:48 
accountmanagement.sqlite-shm
-rw-r--r-- 1 root root0 May 15 10:48 
accountmanagement.sqlite-wal

-rw-r--r-- 1 root root 34217984 May 15 10:47 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig

root@demonotifications:/jambuster/notifications/upgrade_test


As can be seen above, we have accountmanagement.sqlite-shm and
accountmanagement.sqlite-wal files left which is odd. All of this 
from a

select statement.

If we do a pragma integrity check we get

  integrity_check
  ---
  ok

All the reading we have done seems to indicate that having -shm and 
-wal
files indicates an error, if this was our code, we'd be looking deep 
into
that but a simple select statement shouldn't cause this sort or 
error, can

it?

The select statement has a number of foreign key delete cascades but that's it, and as we are not deleting anything this shouldn't be an issue anyway.


Do we even have a problem? Going back into the database and closing 
it
again, just with the command line utility sqlite3 doesn't clear the 
files.


Any help or suggestions welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] -shm and -wal files left after trivially simple select statement from command line

2017-05-15 Thread Rob Willett

Hi,

We've encountered what we think is an odd situation and we can't find 
any explanation for why this is. We're also not sure if its a problem or 
not.


A brief summary is that we are doing a major database upgrade and are 
doing (lots) of testing. The box is a Linux box running Ubuntu 14.04 LTS 
Server. This is a standard box we have used for a few years, we *might* 
move to 16 LTS shortly.


As part of the testing we noticed that -shm and -wal files were being 
left after we used sqlite3 on the command line. This puzzled us as we 
didn't see any errors in our test scripts.


We then narrowed this problem down to doing a simple SELECT statement on 
a table.


e.g. This is our starting state. Note that nothing is touching these databases as they are isolated in a directory, so there is no other process playing with them.


root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66892
-rw-r--r-- 1 root root 34217984 May 15 10:38 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig


root@demonotifications:/jambuster/notifications/upgrade_test# cp 
accountmanagement.sqlite.orig accountmanagement.sqlite


root@demonotifications:/jambuster/notifications/upgrade_test# sqlite3 
accountmanagement.sqlite

-- Loading resources from /root/.sqliterc

SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from device;
UUID isValid Version 
WhatAmI Application
---  --  
--  --  --
<  1   1.1 
ios_app JambusterForLondon

.. Hundreds more lines of device data which looks correct
<  1   1.1 
ios_app JambusterForLondon


sqlite> .exit
root@demonotifications:/jambuster/notifications/upgrade_test# ls -lt
total 66924
-rw-r--r-- 1 root root32768 May 15 10:48 
accountmanagement.sqlite-shm
-rw-r--r-- 1 root root0 May 15 10:48 
accountmanagement.sqlite-wal

-rw-r--r-- 1 root root 34217984 May 15 10:47 accountmanagement.sqlite
-rw-r--r-- 1 root root58567 May 15 10:31 upgradedb.sql
-rw-r--r-- 1 root root 34217984 May 13 20:38 
accountmanagement.sqlite.orig

root@demonotifications:/jambuster/notifications/upgrade_test


As can be seen above, we have accountmanagement.sqlite-shm and 
accountmanagement.sqlite-wal files left which is odd. All of this from a 
select statement.


If we do a pragma integrity check we get

  integrity_check
  ---
  ok

All the reading we have done seems to indicate that having -shm and -wal files indicates an error. If this was our code, we'd be looking deep into that, but a simple select statement shouldn't cause this sort of error, can it?


The select statement has a number of foreign key delete cascades but that's it, and as we are not deleting anything this shouldn't be an issue anyway.


Do we even have a problem? Going back into the database and closing it 
again, just with the command line utility sqlite3 doesn't clear the 
files.


Any help or suggestions welcomed.

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-20 Thread Rob Willett

Just to close this thread down completely.

We've implemented the changes on our production server and the actual 
performance increase is even better than we predicted.


Our production server is a containerised Ubuntu Server. Its hardware 
configuration is significantly different to our Macs on which we 
develop. Our Macs have more RAM and SSD disks. The production server has 
other advantages though :)


The query took 90 mins on our production server, this is mainly due to 
slower disks than we have locally, so this was not unexpected.


After doing the performance changes (removing unneeded fields and adding 
in a covering index), we went down to 38 secs.


This has meant we are no longer looking at a second database just for analytics; we just needed to learn to use the database we already had :)


Rob

On 18 Mar 2017, at 5:48, Rob Willett wrote:

We've just implemented a covering index for the last step (is it 
really?) in our quest to get the query execution time down.


To summarise we have gone from 32 mins to 16 mins by updating an index 
so it doesn't use collate, we took another six minutes off by removing 
extra fields in the select we didn't need.


We have just created a new index which 'covers' all the fields we use 
in the select, this means (and I paraphrase) that we use the index to 
get all the data and there is no need to read from the database.


Well that was a bit of a surprise: the index creation took 45 mins, we ran the program again and thought, rats, we've cocked it up, it only took 54 secs, we got something wrong. So we checked it and checked again and we hadn't got anything wrong. Our query has moved from 32 mins to 54 secs.


We're quite happy with that performance increase. In fact we're 
delighted, so thanks for all the help in getting us to this stage.


We have kept copies of the query planner bytecode output if anybody is 
interested. Gunter has had copies, but if anybody else would like 
them, please ask.


Many thanks again for all the help,

Rob

On 17 Mar 2017, at 22:12, Rob Willett wrote:


Dear all,

We progress steadily forward.

1. We immediately halved our execution time by moving to an updated 
index that doesn't use COLLATE. Thats 32mins to 16mins.


2. We then shaved a further six minutes off the execution time by removing extraneous fields in the select statement, so instead of "select * ...", we identified which fields we used and directly selected those. So we are now down to 10 mins, or 1/3 of where we started, for (to be honest) virtually no extra work, merely being smarter, or rather you being smarter.


3. We have looked through all our indexes and can see that every 
index has a COLLATE against it, even if the column is an integer. We 
have raised a support call with Navicat.


4. The next step is to create a "covering index" to try and get the whole of the query into the index. However it's 22:11 in London and I need to get home.


Thanks very much for the help so far. Tomorrow is more tricky but I'll read up on covering indexes to see how to use them.


Rob

On 17 Mar 2017, at 18:39, Simon Slavin wrote:

On 17 Mar 2017, at 6:22pm, Rob Willett 
<rob.sql...@robertwillett.com> wrote:


4. Work through returning just the columns we actually need from 
our queries. We have a recollection that if we can build an index 
with all the information necessary in it, we can do all the work in 
joins rather than paging out to disk. Is this what you are 
referring to?


It works only where all the columns you need to read are in the same 
table.  The ideal form of a covering index is to have the columns 
listed in this order:


1) columns needed for the WHERE clause
2) columns needed for the ORDER BY clause which aren’t in (1)
3) columns needed to be read which aren’t in (2) or (1)

SQLite detects that all the information it needs for the SELECT is 
available from the index, so it doesn’t bother to read the table 
at all.  This can lead to something like a doubling of speed.  Of 
course, you sacrifice filespace, and making changes to the table 
takes a little longer.
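As a concrete illustration of that ordering (a sketch only; the column list here is an assumption for the status query, not the real set of columns used):

```
-- Sketch: WHERE column first, then the columns the SELECT actually reads.
-- severity and startTime are illustrative stand-ins for the real columns.
CREATE INDEX Disruptions_covering_idx
    ON Disruptions (status, severity, startTime);

-- Check whether the planner now reports a covering index:
EXPLAIN QUERY PLAN
SELECT status, severity, startTime
FROM Disruptions
WHERE status = 2 OR status = 6;
```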


5. Sleep (not exactly sure when) and watch three international 
rugby games tomorrow.


Sleep while waiting for indexes to be created and ANALYZE to work.  
May you see skilled players, creative moves and dramatic play.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-20 Thread Rob Willett

David,

We're English and masters of the understatement ;)

Rob

On 20 Mar 2017, at 15:04, David Raymond wrote:


"... Our query has moved from 32 mins to 54 secs.

We're quite happy with that performance increase."

I have to admit that the hearty chuckle which that statement produced 
from me hurt my sore throat. Totally worth it though.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett
We've just implemented a covering index for the last step (is it 
really?) in our quest to get the query execution time down.


To summarise we have gone from 32 mins to 16 mins by updating an index 
so it doesn't use collate, we took another six minutes off by removing 
extra fields in the select we didn't need.


We have just created a new index which 'covers' all the fields we use in 
the select, this means (and I paraphrase) that we use the index to get 
all the data and there is no need to read from the database.


Well that was a bit of a surprise: the index creation took 45 mins, we ran the program again and thought, rats, we've cocked it up, it only took 54 secs, we got something wrong. So we checked it and checked again and we hadn't got anything wrong. Our query has moved from 32 mins to 54 secs.


We're quite happy with that performance increase. In fact we're 
delighted, so thanks for all the help in getting us to this stage.


We have kept copies of the query planner bytecode output if anybody is 
interested. Gunter has had copies, but if anybody else would like them, 
please ask.


Many thanks again for all the help,

Rob

On 17 Mar 2017, at 22:12, Rob Willett wrote:


Dear all,

We progress steadily forward.

1. We immediately halved our execution time by moving to an updated 
index that doesn't use COLLATE. Thats 32mins to 16mins.


2. We then shaved a further six minutes off the execution time by removing extraneous fields in the select statement, so instead of "select * ...", we identified which fields we used and directly selected those. So we are now down to 10 mins, or 1/3 of where we started, for (to be honest) virtually no extra work, merely being smarter, or rather you being smarter.


3. We have looked through all our indexes and can see that every index 
has a COLLATE against it, even if the column is an integer. We have 
raised a support call with Navicat.


4. The next step is to create a "covering index" to try and get the whole of the query into the index. However it's 22:11 in London and I need to get home.


Thanks very much for the help so far. Tomorrow is more tricky but I'll read up on covering indexes to see how to use them.


Rob

On 17 Mar 2017, at 18:39, Simon Slavin wrote:

On 17 Mar 2017, at 6:22pm, Rob Willett <rob.sql...@robertwillett.com> 
wrote:


4. Work through returning just the columns we actually need from our 
queries. We have a recollection that if we can build an index with 
all the information necessary in it, we can do all the work in joins 
rather than paging out to disk. Is this what you are referring to?


It works only where all the columns you need to read are in the same 
table.  The ideal form of a covering index is to have the columns 
listed in this order:


1) columns needed for the WHERE clause
2) columns needed for the ORDER BY clause which aren’t in (1)
3) columns needed to be read which aren’t in (2) or (1)

SQLite detects that all the information it needs for the SELECT is 
available from the index, so it doesn’t bother to read the table at 
all.  This can lead to something like a doubling of speed.  Of 
course, you sacrifice filespace, and making changes to the table 
takes a little longer.


5. Sleep (not exactly sure when) and watch three international rugby 
games tomorrow.


Sleep while waiting for indexes to be created and ANALYZE to work.  
May you see skilled players, creative moves and dramatic play.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Dear all,

We progress steadily forward.

1. We immediately halved our execution time by moving to an updated 
index that doesn't use COLLATE. Thats 32mins to 16mins.


2. We then shaved a further six minutes off the execution time by removing extraneous fields in the select statement, so instead of "select * ...", we identified which fields we used and directly selected those. So we are now down to 10 mins, or 1/3 of where we started, for (to be honest) virtually no extra work, merely being smarter, or rather you being smarter.


3. We have looked through all our indexes and can see that every index 
has a COLLATE against it, even if the column is an integer. We have 
raised a support call with Navicat.


4. The next step is to create a "covering index" to try and get the whole of the query into the index. However it's 22:11 in London and I need to get home.


Thanks very much for the help so far. Tomorrow is more tricky but I'll read up on covering indexes to see how to use them.


Rob

On 17 Mar 2017, at 18:39, Simon Slavin wrote:

On 17 Mar 2017, at 6:22pm, Rob Willett <rob.sql...@robertwillett.com> 
wrote:


4. Work through returning just the columns we actually need from our 
queries. We have a recollection that if we can build an index with 
all the information necessary in it, we can do all the work in joins 
rather than paging out to disk. Is this what you are referring to?


It works only where all the columns you need to read are in the same 
table.  The ideal form of a covering index is to have the columns 
listed in this order:


1) columns needed for the WHERE clause
2) columns needed for the ORDER BY clause which aren’t in (1)
3) columns needed to be read which aren’t in (2) or (1)

SQLite detects that all the information it needs for the SELECT is 
available from the index, so it doesn’t bother to read the table at 
all.  This can lead to something like a doubling of speed.  Of course, 
you sacrifice filespace, and making changes to the table takes a 
little longer.


5. Sleep (not exactly sure when) and watch three international rugby 
games tomorrow.


Sleep while waiting for indexes to be created and ANALYZE to work.  
May you see skilled players, creative moves and dramatic play.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Simon,

We're still benchmarking the various changes we've made during the day. 
I'm sitting here watching a tail of a log file waiting for it to finish. 
Ah the joys of the internet startup :)


Our plan of action is

1. Have baseline from which we can work from. We now have that.

2. Create the correct sub-query index and work out the (hopeful) increase in speed.


3. Work through all the remaining indexes and check that we have not 
made the same mistake. I know we actually have :( Keep rerunning our 
benchmark so we know if we are actually making a difference.


4. Work through returning just the columns we actually need from our 
queries. We have a recollection that if we can build an index with all 
the information necessary in it, we can do all the work in joins rather 
than paging out to disk. Is this what you are referring to?


5. Sleep (not exactly sure when) and watch three international rugby 
games tomorrow.


Rob

On 17 Mar 2017, at 18:15, Simon Slavin wrote:

On 17 Mar 2017, at 5:30pm, Rob Willett <rob.sql...@robertwillett.com> 
wrote:


echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null


twice and each run was 12 mins. So we were twice as quick, which is nice.


Do you actually need all columns ?  If not, then specifying the 
columns you need can lead to a further speedup.  It might be enough 
just to specify the columns you need, but you can achieve further 
increases in speed by making a covering index.  If speed for this 
SELECT is sufficiently important to you, and you don’t actually need 
all columns, post again and we’ll explain further.


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Thanks to everybody for their help earlier today.

As promised here's the results of our various tests. Hopefully they may 
be of use to somebody...


We decided to start from a known position and so recreated the original index with the collation in it. We know this was sub-optimal but it's our reference point. We have the bytecode output if anybody wants to see it.


CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE NOCASE 
ASC);


We ran the the following SQL twice

echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null


and the two runs totalled 46 mins. Each was actually 23 mins.

We then dropped the old index, built the new one

echo 'CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");' | 
sqlite3 tfl.sqlite


We ran

echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null


twice and each run was 12 mins. So we were twice as quick, which is nice.

We then ran

echo "explain select * from Disruptions where status = 2 UNION ALL 
select * from Disruptions where status = 6;" | sqlite3 tfl.sqlite


twice. Each run was around 11.5 mins. We're not going to get into 
differences of less than a minute on a run of this size, so we'll say 
they are about the same speed.


Interesting results: clearly the collation does make a big difference. We are now going to go through the schema and check if we have made the same mistake elsewhere.


Thanks for your help, we can post the bytecode if people are interested.

Rob

On 17 Mar 2017, at 11:41, Rob Willett wrote:


Gunter,

I would never presume to describe anybody as a Nerd!

We're just going back to the very first position with the 'bad' collation index so we can do proper timings as we change things, so we understand the speed up (we hope there is a speed up)


We've written a quick script to check each version. Once we've put the 
original index back in, we've added a step to generate the SQLite 
bytecode for you. It's the least we can do...


We'll post this when its completed but we suspect it may take most of 
the day now :)


echo "Using Index 'CREATE INDEX "Disruptions_idx4" ON Disruptions 
("status" COLLATE NOCASE ASC);'"
echo "explain select * from Disruptions where status = 2 OR status = 
6;" | sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Creating new index without collation"
echo "drop index Disruptions_idx4;" | sqlite3 tfl.sqlite
echo 'CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");' | 
sqlite3 tfl.sqlite
echo "explain select * from Disruptions where status = 2 OR status = 
6;" | sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Trying SELECT statement with UNION ALL"
echo "explain select * from Disruptions where status = 2 OR status = 
6;" | sqlite3 tfl.sqlite

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date


On 17 Mar 2017, at 11:30, Hick Gunter wrote:

Nerds with chronic byte code affinity like myself would like to see 
the output of "explain" (without "query plan"), i.e. the SQLite 
bytecode produced. I guess the query with OR will have a subprogram 
called once for each status value, whereas I expect the query with 
UNION ALL to have 2 copies of the search (which would not affect the 
run time) and maybe even a temporary table of results (which would 
take longer and use more memory).


-Ursprüngliche Nachricht-
Von: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im Auftrag von 
Rob Willett

Gesendet: Freitag, 17. März 2017 12:19
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index 
question ...


Gunter, Simon,

Thanks for the replies, You both seem to be drilling into the 
collation sequence as a possible issue. We now have a new index and 
we have just run the query again


sqlite> analyze;
sqlite> drop index Disruptions_idx4;
sqlite> CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");
sqlite> explain query plan select * 

Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Gunter,

I would never presume to describe anybody as a Nerd!

We're just going back to the very first position with the 'bad' collation index so we can do proper timings as we change things, so we understand the speed up (we hope there is a speed up)


We've written a quick script to check each version. Once we've put the 
original index back in, we've added a step to generate the SQLite 
bytecode for you. It's the least we can do...


We'll post this when its completed but we suspect it may take most of 
the day now :)


echo "Using Index 'CREATE INDEX "Disruptions_idx4" ON Disruptions 
("status" COLLATE NOCASE ASC);'"
echo "explain select * from Disruptions where status = 2 OR status = 6;" 
| sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Creating new index without collation"
echo "drop index Disruptions_idx4;" | sqlite3 tfl.sqlite
echo 'CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");' | 
sqlite3 tfl.sqlite
echo "explain select * from Disruptions where status = 2 OR status = 6;" 
| sqlite3 tfl.sqlite

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions where status = 2 OR status = 6;" | 
sqlite3 tfl.sqlite > /dev/null

date

echo "---"

echo "Trying SELECT statement with UNION ALL"
echo "explain select * from Disruptions where status = 2 OR status = 6;" 
| sqlite3 tfl.sqlite

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date
echo "select * from Disruptions  where status = 2 UNION ALL select * 
from Disruptions where status = 6;" | sqlite3 tfl.sqlite > /dev/null

date


On 17 Mar 2017, at 11:30, Hick Gunter wrote:

Nerds with chronic byte code affinity like myself would like to see 
the output of "explain" (without "query plan"), i.e. the SQLite 
bytecode produced. I guess the query with OR will have a subprogram 
called once for each status value, whereas I expect the query with 
UNION ALL to have 2 copies of the search (which would not affect the 
run time) and maybe even a temporary table of results (which would 
take longer and use more memory).


-Ursprüngliche Nachricht-
Von: sqlite-users 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im Auftrag von 
Rob Willett

Gesendet: Freitag, 17. März 2017 12:19
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index 
question ...


Gunter, Simon,

Thanks for the replies, You both seem to be drilling into the 
collation sequence as a possible issue. We now have a new index and we 
have just run the query again


sqlite> analyze;
sqlite> drop index Disruptions_idx4;
sqlite> CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");
sqlite> explain query plan select * from Disruptions where status = 2 
OR

status = 6;
selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

So we have a different response from the query planner, which I think 
is good.


If we use your other example

sqlite> explain query plan select * from Disruptions  where status = 2
UNION ALL select * from Disruptions where status = 6;
selectid|order|from|detail
1|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
2|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)

I'm not sure which query is going to be faster. We'll have to try and 
see.


Your last suggestion of "select * from Disruptions  where status =2 
COLLATE NOCASE or status = 6 COLLATE NOCASE" appears to be logically 
equivalent to "explain query plan select * from Disruptions where 
status = 2 OR status = 6;" now we have removed the collation from the 
index.


sqlite> explain query plan select * from Disruptions  where status =2
COLLATE NOCASE or status = 6 COLLATE NOCASE;
selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

I'll check if we require all the fields, we require many (which I 
agree is not all) of the fields. Following this logic through, does 
this mean that it will do more file access bringing the records in 
from the file system?


The collation issue seems to be an artifact of the way Navicat for SQLite works. I suspect we need to be more careful about how we use the tool.



Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Gunter, Simon,

Thanks for the replies, You both seem to be drilling into the collation 
sequence as a possible issue. We now have a new index and we have just 
run the query again


sqlite> analyze;
sqlite> drop index Disruptions_idx4;
sqlite> CREATE INDEX "Disruptions_idx4" ON Disruptions ("status");
sqlite> explain query plan select * from Disruptions where status = 2 OR 
status = 6;

selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

So we have a different response from the query planner, which I think is 
good.


If we use your other example

sqlite> explain query plan select * from Disruptions  where status = 2 
UNION ALL select * from Disruptions where status = 6;

selectid|order|from|detail
1|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
2|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)

I'm not sure which query is going to be faster. We'll have to try and 
see.


Your last suggestion of "select * from Disruptions  where status =2 
COLLATE NOCASE or status = 6 COLLATE NOCASE" appears to be logically 
equivalent to "explain query plan select * from Disruptions where status 
= 2 OR status = 6;" now we have removed the collation from the index.


sqlite> explain query plan select * from Disruptions  where status =2 
COLLATE NOCASE or status = 6 COLLATE NOCASE;

selectid|order|from|detail
0|0|0|SEARCH TABLE Disruptions USING INDEX Disruptions_idx4 (status=?)
0|0|0|EXECUTE LIST SUBQUERY 1
sqlite>

I'll check if we require all the fields, we require many (which I agree 
is not all) of the fields. Following this logic through, does this mean 
that it will do more file access bringing the records in from the file 
system?


The collation issue seems to be an artifact of the way Navicat for SQLite works. I suspect we need to be more careful about how we use the tool.
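One quick way to spot the affected indexes (a sketch, not something run in this thread) is to query sqlite_master directly:

```
-- Sketch only: list index definitions on Disruptions that carry a COLLATE clause.
SELECT name, sql
FROM sqlite_master
WHERE type = 'index'
  AND tbl_name = 'Disruptions'
  AND sql LIKE '%COLLATE%';
```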


We'll now time the results of each query and run them twice to see the effect. No idea how long this will take but suspect a few hours :) I will post back the results as other people may (or may not) find this helpful.


Thanks

Rob

On 17 Mar 2017, at 10:57, Hick Gunter wrote:

On 17 Mar 2017, at 10:20am, Rob Willett 
<rob.sql...@robertwillett.com> wrote:


CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE 
NOCASE ASC);


[…]

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;
```


The schema for the table says that "status" is INTEGER.
You are supplying numbers as arguments.
Those two match and should create no problem.

But your index has a collation order which is usually used for text.  
I don’t see that it is obviously wrong, but it does look a little 
weird.


Try creating another index which is just on "status", without the 
COLLATE clause.

Then do another ANALYZE, then try the SELECT again.

Simon.


If the index is deemed unsuitable by SQLite due to its collation sequence, then I expect it would also be ignored in "select ... status=1" (without the second ORed value)


If not, then (select ... where status =2 UNION ALL select where status 
= 6) should do the trick


Do you really require all the fields from Disruptions?

And yes, collating integers with NOCASE seems quite strange (there are 
no capital or lowercase numbers unless you are using roman numerals ;) 
); for text affinity, it should render the comparison operators 
caseblind, just like "like".



___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use 
of the intended recipient(s) only and may contain information that is 
confidential, privileged or legally protected. Any unauthorized use or 
dissemination of this communication is strictly prohibited. If you 
have received this communication in error, please immediately notify 
the sender by return e-mail message and delete all copies of the 
original communication. Thank you for your cooperation.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Simon,

Thank you for the quick response. We'll do this. We're just waiting for another analyze to finish. Our current DB is 45GB; the query works OK on smaller databases, but at this size it's very slow, that's why we have a bad feeling that we're tinkering on the edges and we need a new DB design :(


The reason for the collation is the way Navicat for SQLite works; it's difficult to get it without the collate but we'll do this directly from the command line.


We'll post back in an hour or so as it'll probably take that long to run 
:)


Rob

On 17 Mar 2017, at 10:27, Simon Slavin wrote:

On 17 Mar 2017, at 10:20am, Rob Willett <rob.sql...@robertwillett.com> 
wrote:


CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE 
NOCASE ASC);


[…]

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;
```


The schema for the table says that "status" is INTEGER.
You are supplying numbers as arguments.
Those two match and should create no problem.

But your index has a collation order which is usually used for text.  
I don’t see that it is obviously wrong, but it does look a little 
weird.


Try creating another index which is just on "status", without the 
COLLATE clause.

Then do another ANALYZE, then try the SELECT again.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Yet Another Why Doesn't Sqlite Use My Index question ...

2017-03-17 Thread Rob Willett

Hi,

We've been struggling with a complex query that we have written. One of 
the elements of this complex query is a select statement that doesn't 
appear to use an index when we *think* it should do. We're not going to 
ask you to debug a large complex SQL query (unless you have nothing else 
to do today) but we're working our way through the query trying to 
understand where we've gone wrong. So we've broken down the query and are trying each section to see how it performs (or rather doesn't perform).


The sub query is a simple select on a large table, Disruptions. The 
Disruptions table has 180M rows of data.


The schema for it is here. We've pulled it straight from Navicat for 
SQLite.


```
CREATE TABLE "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", 
'184626834');


-- 
--  Indexes structure for table Disruptions
-- 
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC, 
"category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" COLLATE 
NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE 
NOCASE ASC);
CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE NOCASE 
ASC);


PRAGMA foreign_keys = true;
```

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;
```

Status is one of six values, 1 to 6, and is probably not evenly distributed across the 180M rows.
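A quick way to check that distribution (a sketch, not something we've run for this post):

```
-- Sketch only: count rows per status value.
SELECT status, COUNT(*) AS n
FROM Disruptions
GROUP BY status
ORDER BY status;
```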


If we do

```
sqlite> explain query plan select * from Disruptions where status = 2 OR 
status = 6;

selectid|order|from|detail
0|0|0|SCAN TABLE Disruptions
```

We can see that table scanning 180M records is going to be slow, no matter what the rest of the complex query is like.


We have an index Disruptions_idx4 which we *think* should speed it up, 
but the query plan doesn't seem to take this into account.


We think that only having six values of Status means that the speed up 
from the index is not going to be fantastic but every little helps.


We have run analyze on the database and that hasn't helped.

Our gut feeling at this moment is that we have the design structure 
wrong in our DB and we're going to have to take a long hard look at what 
we're doing, do a redesign and a rebuild as we simply got it wrong from 
the beginning. Hindsight is wonderful ;) In the interim (as this is a 
2-3 month job), we need to speed our query up from 90 mins down to 
something in the tens of mins.


Any suggestions very much welcomed,

Thanks

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] New tool for PUTTY logging [Windows]

2017-02-02 Thread Rob Willett
I've been following this thread with interest. I have used Putty for years as it's the de-facto standard for decent ssh terminals on Windows boxes.


A slightly more radical suggestion for the log files. Since Putty is 
open source, have a look at the code and see if you can easily add in a 
timestamp per line for the log file section.


That gives you two features:

1. You now have a verifiable traceable source as you have downloaded and 
compiled it. I have worked in environments where we need to trace every 
bit of code that comes into the data centre. We need to know where we 
downloaded it from, what the license was, authority from legal to use 
etc etc. Your situation might not warrant it :)


2. You now have unique lines with a timestamp, a hostname and a TTY 
session (I assume). I think that guarantees uniqueness.


I have no idea if Putty can be modified in this way, but it wouldn't 
hurt to have a look, see if the change is easy, do the change and then 
send some diffs back to the Putty team. If they accept the changes, you're sorted. If they don't, well, Putty doesn't change that much over time so you could probably use your version for years to come.


Rob

On 2 Feb 2017, at 19:53, Stephen Chrzanowski wrote:

I can only get to our customer machines by jumping into a server that 
has
access to both sides of the network.  Our side, and the customer side. 
 I
can't get to a customers machine directly.  The  is out, but I'm 
already

doing the rest.

The image in my head of what my program is going to do is that I feed 
it a
date range, a server I'm interested in, and optionally provide text 
that
further filters the information I'm looking for.  Once I have the 
filtered
data, I'd have a list of days that I'd been on that exact server, 
and/or

entries that mention my subject server, and I can see the text only
pertaining to that machine and date range.  I'd be able to read the 
full
set of activities on that machine for that day, and not have to hop 
around

to multiple log files..  This would get rid of the concept of many log
files as well, since all files are now one.  Kind of Borg-ish?


On Thu, Feb 2, 2017 at 11:54 AM, Donald Griggs  
wrote:



Maybe another method to consider:

This guy shows that Putty appears to support creating separate log 
files

for each session including a timestamp in the file name.

https://www.viktorious.nl/2013/01/14/putty-log-all-session-output/

Could your script import any new log files it sees, then move them to 
an

archive?

That way, you'd never have to read through a huge log file to find 
what

should be imported.


==From the page linked above:


I am using some putty parameters which will make every session 
unique, in

this case “”, which means:

   - &H = hostname for the session
   - &Y = year
   - &M = month
   - &D = day
   - &T = time

 ==
​
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] creating a table

2016-11-21 Thread Rob Willett

John,

There is a lot of documentation on the SQLite website.

Here's the 'official' docs on creating a table 
https://www.sqlite.org/lang_createtable.html


A Sqlite database consists of many tables. I am unsure if there is an upper limit; if there is, it's more tables than I have ever created. You probably need to read up a bit more on SQL in general and SQLite in particular. It's a great SQL database for many uses, though not for every use. Here's a summary https://sqlite.org/whentouse.


SQLite does not set limits on text fields. That's a great strength (some people may disagree); see here https://www.sqlite.org/datatype3.html


You can create a table and specify the text length but it's there for compatibility and is ignored. If you want to drop 20, 200 or 2000 chars in your field, go ahead and do it.
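For example (a minimal sketch; the column names are made up for illustration, only the table name comes from your mail):

```
-- Sketch only: the (30) is parsed but not enforced by SQLite.
CREATE TABLE log16 (
    entry_date TEXT,
    message    VARCHAR(30)
);

-- A much longer string still inserts without error or truncation.
INSERT INTO log16 (entry_date, message)
VALUES ('2016-11-21', 'this message is comfortably longer than thirty characters');
```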


SQLite is very flexible and very fast; there's a ton of help and docs out there, and the support is direct from the people who write it.


Rob


On 21 Nov 2016, at 17:29, John R. Sowden wrote:

First of all, I come from the dBASE/Foxpro world. There is no 
distinction between a table and a database.  I understand that with 
Sqlite a database includes tables and other items.  The scenario that 
I do not understand, is: say I have a log file with about 7 fields 
totaling about 80 characters per record.  How do I name the database 
and table.  Currently I say log16 for the year 2016.


Secondly, I have 2 "front ends" for Sqlite on my Ubuntu 16.04 
computer.  Neither one allows me to set the length of the text fields 
in the table creation process.  How does the Sqlite know how long each 
record should be, same with integers.


No help found in the documentation on the Sqlite web site.

John

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Segmentation fault on OpenBSD

2016-11-01 Thread Rob Willett

Simon,

We use the Perl DBD module all the time. What I would expect to see is 
(IGNORE THE LINE BREAKS)


my $sth = $dbh->prepare("INSERT INTO
func_begin_change(
author,
author_contact,
author_contact_method,
author_shortname,
id,
lang,
mtime,
mtimetz,
parent_id
)
VALUES
(
?,
?,
?,
?,
?,
?,
?,
?,
?
)");

followed by

$sth->execute('Mark Lawrence',
  'em...@address.net',
  'email',
  'ML',
  13,
  'en',
  '1478010282201',
  360,
  undef);

The prepare sets the statement up and the execute fills the variables 
in.


Now Mark has probably printed out an array in Perl with the variables using Dumper. That's shown by the use of $VAR1. Anybody who uses Perl uses Dumper and is very familiar with the output format :)


Here's what the Perl array could/would/should look like

	['Mark 
Lawrence','em...@address.net','email','ML',13,'en','1478010282201',360,undef]


Now he could call $sth->execute_array(\%attr) rather than $sth->execute(var1, var2); see http://search.cpan.org/~timb/DBI-1.636/DBI.pm#execute


We've just checked our Perl libraries and we have never used 
$sth->execute_array as we prefer to be very specific about what we pass 
in.


It would be useful to see the actual Perl itself rather than the abstraction in the email. I can't comment on BSD as we don't use BSD at all, but the coding pattern in Perl is pretty standard and very, very common. I'd be surprised if this is a problem but you never know.


Rob


On 1 Nov 2016, at 14:57, Simon Slavin wrote:




On 1 Nov 2016, at 2:38pm, mark  wrote:

   VALUES
   (
   ?,
   ?,
   ?,
   ?,
   ?,
   ?,
   ?,
   ?,
   ?
   )
   ;

At least that is what I am passing to Perl's DBD::SQLite prepare()
method. I am then binding the following values before running 
execute:


   $VAR1 = [
 'Mark Lawrence',
 'em...@address.net',
 'email',
 'ML',
 13,
 'en',
 '1478010282201',
 360,
 undef
   ];


Excuse me.  I don't know Perl, or how it uses its DBD module.  Are you 
binding each of those values (e.g. 'ML') separately, or are you 
binding the whole of $VAR1 as one operation ?


Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database malformed after 6000 Inserts?

2016-10-04 Thread Rob Willett

We don't use Windows Server or System.Data.SQLite so can't comment.

I'd be astonished if it's SQLite itself that's at fault here.

Rob

On 4 Oct 2016, at 13:24, Werner Kleiner wrote:


Thanks for help.
Hopefully I give you the correct answer, because a colleague has
written the C# program.
We had no problems with inserts in the past, but now when we have
about 6000 insert records we get the errors.

The OS is Server 2012, there are no pragma settings (but the colleague has
also used it with pragma settings)
It is a single thread with "synchronize full"
We use the System.Data.SQLite.dll with version 1.0.89.0 from 
12.12.2013


hope this is what you mean

2016-10-04 13:48 GMT+02:00 Jim Borden :
I had a problem similar to this before. What is the threading model 
for access to the database and how is the native library compiled and 
configured?


Jim Borden
(Sent from a mobile device)

On 4 Oct 2016, at 19:12, Werner Kleiner  
wrote:


Hello,
a program written in C# makes inserts from an SQL script into a 
sqlite db.

We now saw that the database will be malformed after 6000 records.

Is there a limitation with huge inserts?
What could be the problem?

regards
Werner
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database malformed after 6000 Inserts?

2016-10-04 Thread Rob Willett
We've done inserts of tens of thousands at a time; we may well have done hundreds of thousands in one single transaction. I've no doubt other people do even larger transactions.
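For reference, a minimal sketch of what we mean by batching inserts inside one transaction (table and column names are illustrative):

```
-- Sketch only: one sync at COMMIT instead of one per INSERT.
BEGIN;
INSERT INTO some_table (col1, col2) VALUES ('a', 1);
INSERT INTO some_table (col1, col2) VALUES ('b', 2);
-- ... thousands more rows ...
COMMIT;
```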


I would assume the problem lies elsewhere.

What error message are you getting? Whats the OS, the environment, disk, 
pragma settings?


Rob

On 4 Oct 2016, at 11:11, Werner Kleiner wrote:


Hello,
a program written in C# makes inserts from an SQL script into a sqlite 
db.

We now saw that the database will be malformed after 6000 records.

Is there a limitation with huge inserts?
What could be the problem?

regards
Werner
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Rob Willett
We tend to use append only. We delete some small transient data as we use it; this transient data maintains a bit of state between appends. We're talking 300-500 lines of a table.


It's not read only, we add a lot of data every 3-5 minutes, but it's all driven from a single Perl process that captures XML data, processes it as only Perl can, and then updates the database. Nothing updates the database between those time intervals. No foreign keys at all.


The schema is quite long but rather simplistic. There are lots of tables all referenced by a big central table. Since a lot of our data is common and referenced thousands of times by different rows, we needed to reduce the amount of data or else we would have tripled the size of the main database to over 100GB.


It's 1,200 lines long so not really suitable for posting here, but the main table is


-- 
--  Table structure for Disruptions
-- 
DROP TABLE IF EXISTS "Disruptions";
CREATE TABLE "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", 
'0');


All the integer fields are really indexes into other tables. We made a 
deliberate decision to move some of the more complex logic out of SQL 
into Perl as it sat better there. This was no reflection on SQLite, we 
would have made the same decision with PostGres or MySQL or Oracle. 
Right tool for the right job etc etc.


Rob

On 2 Oct 2016, at 22:47, Domingo Alvarez Duarte wrote:


Hello Rob !

How do you use those big databases ?

Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !

On 02/10/16 17:44, Rob Willett wrote:
We have production databases with 30-50GB and have no issues with managing them. Mind you, we have more memory, though we only have 8GB as that's all our VM provider allows us.


After saying that, we have never seen any performance issues that weren't due to our schemas or our bad design. Our working assumption is that if there's a problem, it's our fault before we blame SQLite. So far that working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data 
and had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main 
memory as rather small for this sort of thing.


Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:


Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm 
trying to update a table with 980M rows but sqlite is eating all my 
memory (2GB) and making blocking my computer (too much swap).


I'm in doubt if sqlite can really manage databases above 1GB.

Have someone any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX 
favorecidos_nis_idx (nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);

Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Rob Willett
We have production databases with 30-50GB and have no issues with managing them. Mind you, we have more memory, though we only have 8GB as that's all our VM provider allows us.


After saying that, we have never seen any performance issues that weren't due to our schemas or our bad design. Our working assumption is that if there's a problem, it's our fault before we blame SQLite. So far that working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data and 
had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main memory 
as rather small for this sort of thing.


Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:


Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm 
trying to update a table with 980M rows but sqlite is eating all my 
memory (2GB) and making blocking my computer (too much swap).


I'm in doubt if sqlite can really manage databases above 1GB.

Have someone any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX 
favorecidos_nis_idx (nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);

Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Locking databases - Possibly (probably?) a dumb question

2016-08-07 Thread Rob Willett

Jean-Christophe

Thanks for the update on wal-mode. Your explanation is clear and makes sense to us. We can see why we would have a 224MB -wal file; we experimented with killing processes whilst updating and generally messing around, and SQLite did what it was supposed to do. I wouldn’t say we were exhaustively testing it and, to be honest, I know we can’t push SQLite to its limits with the little things we use it for.


We did understand the differences between 3.11 and 3.8 re the size of the -wal file; it's just that I communicated it poorly. Too little sleep and far too much coffee.


We are going to do some more tests, more about familiarising ourselves 
with WAL rather than expecting it to break to be honest. WAL seems to 
work well enough for us and assuming our last conversion tests work OK, 
we’ll shine it in tomorrow night when we get some downtime.


Thanks for your help and elegant description

Rob

On 7 Aug 2016, at 9:59, Jean-Christophe Deschamps wrote:


Rob,

At 08:55 07/08/2016, you wrote:
We think that using WAL mode works for us, indeed inspection seems to 
indicate it does, but the size of the -wal file appears to be far 
larger than would be expected. Is there a problem here? It doesn't 
appear to be a problem but would welcome any comments.


After reading your post I'd like to clear up a few points about WAL 
mode.


We can also see that the main sqlite database is NOT updated (or at 
least the timestamp isn't) whilst we are running the updates in WAL 
mode. This appears to be correct as the updates would be in the -wal 
file.


The WAL mode is persistent and consistent. That means that once 
successfully put in this mode the DB itself will remain in WAL mode for 
every (new) connection. Thus your updates and the long-running query 
are both running under WAL mode. That is, provided the WAL mode was 
set prior to the start of the long-running query, but that detail 
doesn't matter for reads in this case.


It doesn't matter whether your query is a single query statement 
(hence in auto-commit mode) or a huge transaction extracting and 
massaging data in multiple temp tables and myriads of read/write 
statements (all inside an explicit transaction): ACID properties 
guarantee that once your query is started, it will see the DB in the 
state prior to any updates that could occur during its run. Else you 
would obtain potentially dangerously inconsistent data, of course.


We have not set the journal_size_limit and we have a -wal file which 
is 224MB in size, somewhat larger than 4MB. We are running


3.8.2 2013-12-06 14:53:30 27392118af4c38c5203a04b8013e1afdb1cebd0d

which does not appear to have the code in 3.11.0 so that the WAL file 
is proportional to the size of the transaction. From the same page of 
the manual:


```
Very large write transactions.

A checkpoint can only complete when no other transactions are 
running, which
means the WAL file cannot be reset in the middle of a write 
transaction. So a large change to a large database
might result in a large WAL file. The WAL file will be checkpointed 
once the write transaction completes
(assuming there are no other readers blocking it) but in the 
meantime, the file can grow very big.


As of SQLite version 3.11.0, the WAL file for a single transaction 
should be proportional in size to
the transaction itself. Pages that are changed by the transaction 
should only be written into the WAL
file once. However, with older versions of SQLite, the same page 
might be written into the WAL file multiple

times if the transaction grows larger than the page cache.
```


Reread the quoted part again: only with SQLite versions 3.11.0 and 
above will a given page be written only once in the WAL file when 
initiated within a given transaction.


Since you're running a prior version, it's pretty logical to expect 
that your updates will cause writes of multiple distinct copies of the 
same pages in the WAL file. You should update your SQLite version to 
see a possible improvement there. That, or refer only to the old 3.8.2 
documentation, but this is an inferior option because there have been 
significant improvements meanwhile.


Also the ratio in the help file (1000 pages or about 4Mb) applies to 
the default page size (4Kb).


Finally, while the long-running query is running, no checkpoint can 
run to completion. Doc states under "Checkpoint starvation.":


However, if a database has many concurrent overlapping readers and 
there is always at least one active reader, then no checkpoints will 
be able to complete and hence the WAL file will grow without bound.


Since you clearly can't introduce a read-gap inside your read query, 
the .wal file will grow as large as it needs until completion of the 
query. You mentioned that you tested with much more frequent updates 
than the real-world case (120x actually), so .wal file size shouldn't 
be an issue in your actual use case.
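
Once that reader does finally complete, a manual checkpoint will let 
the WAL be reused rather than waiting for an automatic one. A rough 
sketch (the TRUNCATE variant needs SQLite 3.8.8 or later, so on your 
3.8.2 use FULL):

```
-- run after the long-running read transaction has finished
PRAGMA wal_checkpoint(FULL);        -- copies WAL content back into the main DB
-- PRAGMA wal_checkpoint(TRUNCATE); -- 3.8.8+: also shrinks the -wal file to zero
```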


HTH

--
Jean-Christophe

Re: [sqlite] Locking databases - Possibly (probably?) a dumb question

2016-08-07 Thread Rob Willett

Ryan,

Thanks for the update.

We have done a few more tests during the day and not had any issues to 
date. This is still on a test version but we are getting a warm, cuddly 
feeling about using WAL mode.


The -wal file grows as you describe and you have explained it very well. 
We were groping in the twilight to get to where we wanted to go; your 
explanation brought a bright beacon of light onto the proceedings. (I 
have been watching too many speeches from the various US political 
conventions, though I am British.)


We will investigate changing the page size. We would need to work out 
the row size.
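
(Our current understanding, which we still need to verify, is that it 
is roughly the following; the caveats being that the new size only 
takes effect after a VACUUM, cannot be changed while the database is in 
WAL mode, and 8192 is just an example value.)

```
PRAGMA journal_mode = DELETE;  -- page size cannot be changed while in WAL mode
PRAGMA page_size = 8192;       -- example value; takes effect on the next VACUUM
VACUUM;                        -- rebuilds the database with the new page size
PRAGMA journal_mode = WAL;     -- switch back afterwards
PRAGMA page_size;              -- confirm the new value
```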


I will note your OCD in future and ensure that I am accurate in 
reporting numbers rather than have self-inflicted rounding errors; 60x 
is a nicer number than 50x as it maps to mins and secs more easily :)


Thanks again for the help.

Rob

On 7 Aug 2016, at 12:11, R Smith wrote:


On 2016/08/07 8:55 AM, Rob Willett wrote:

Richard, Ryan,

Thanks for this. We were dimly aware of WAL but until now hadn’t 
needed to use it.


We’ve done a quick check with it and it *seems* to work on a test 
database. We’ve all read the docs again and paid attention to 
https://www.sqlite.org/wal.html#bigwal


To test if it works we started our long running analytics query, on 
our test machine it takes around 8 mins. We then speed up the rate of 
updating our database with external data. In the real world an update 
comes along every 3-5 mins, in our test system we queue them up so we 
have them every 6-10 secs so they are around 60x quicker. The updates 
are real data around 3-5MB in size.


We monitored the -wal and the .shm files created as we throw data in 
the database.


The .wal file gets larger and larger until it hits 224MB and then 
stays constant, the .shm file is only 1.8MB and seems to stay at that 
size. We can also see that the main sqlite database is NOT updated 
(or at least the timestamp isn’t) whilst we are running the updates 
in WAL mode. This appears to be correct as the updates would be in 
the -wal file.


I'm truncating this post for brevity - but basically your concern 
about the size (voiced later in the post) is not a concern. What 
happens is the stated 4MB is simply 1000 pages x 4KB default page size 
- your page size might be a lot bigger (and should be set higher 
looking at your DB size and data entry sizes - I think it is "nicer" 
if, at a minimum, a complete row can fit on a page). Further, the WAL 
for your version of SQLite will grow with copies of data and multiple 
inserts in it because of the long-running query not allowing push-back 
checkpoints for the time - and looking at your insert frequency and 
size, your WAL size seems pretty normal. (If you manage it wrong, it 
will fill up terabytes - this is the situation you want to avoid, but 
I think you've got it sorted).


The Documentation simply describes the normal situation, which yours 
isn't.


Also, on a point of satisfying my OCD... going on your quoted averages 
- 5 minutes vs. 10 secs is a 30:1 ratio (5 mins = 300 seconds vs. 10 
secs) so the speed is only 30x faster, not 60 - And before anyone 
asks, yes the low end of the ranges given is 3mins (180s) vs. 6 secs 
which is also a 30:1 ratio.  Even if I take the opposite range 
extremes (5 mins vs. 6s) I still only get 50x speedup.  LoL - Sorry, 
I'll shush now :)


As an aside, I think Richard posted a small study of testing multiple 
DB ops with varying page sizes and varying hardware page sizes, and 
basically, IIRC, the Jury was out on best size in the general case 
with 8192 seeming to be a good standard and the idea that the page 
size should try match the underlying OS page size for best performance 
turned out to be a bit of a "sometimes maybe", but the point was made 
that every implementation should experiment to find the optimum size. 
That said, my memory cannot be trusted - could someone re-post that or 
point us to an on-line page somewhere? Thanks!


Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Locking databases - Possibly (probably?) a dumb question

2016-08-07 Thread Rob Willett

Richard, Ryan,

Thanks for this. We were dimly aware of WAL but until now hadn’t 
needed to use it.


We’ve done a quick check with it and it *seems* to work on a test 
database. We’ve all read the docs again and paid attention to 
https://www.sqlite.org/wal.html#bigwal


To test if it works we started our long running analytics query, on our 
test machine it takes around 8 mins. We then speed up the rate of 
updating our database with external data. In the real world an update 
comes along every 3-5 mins, in our test system we queue them up so we 
have them every 6-10 secs so they are around 60x quicker. The updates 
are real data around 3-5MB in size.


We monitored the -wal and the .shm files created as we throw data in the 
database.


The .wal file gets larger and larger until it hits 224MB and then stays 
constant, the .shm file is only 1.8MB and seems to stay at that size. We 
can also see that the main sqlite database is NOT updated (or at least 
the timestamp isn’t) whilst we are running the updates in WAL mode. 
This appears to be correct as the updates would be in the -wal file.


The times taken for each update seem a little slower (10% possibly but 
that could be just because we are looking at it) but since the data is 
real and variable in size, it might be just our subjective impression.


Once the long running read-only analytics query completes, the main 
sqlite database appears to get updated (or at least the timestamp on the 
file is updated) as we are still updating with our test data and the 
-wal files are still being used.


Once we stop updating with our test data, the -wal files and .shm files 
disappear (as expected).


A quick check of the database appears to show it's correct.

One question though, the size of the -wal file worries us. 
https://www.sqlite.org/wal.html#bigwal states


```
Avoiding Excessively Large WAL Files

In normal cases, new content is appended to the WAL file until the WAL 
file accumulates about 1000 pages (and is
thus about 4MB in size)  at which point a checkpoint is automatically 
run and the WAL file is recycled.
The checkpoint does not normally truncate the WAL file (unless the 
journal_size_limit pragma is set).
Instead, it merely causes SQLite to start overwriting the WAL file from 
the beginning. This is done because
it is normally faster to overwrite an existing file than to append. When 
the last connection to a database
closes, that connection does one last checkpoint and then deletes the 
WAL and its associated shared-memory

file, to clean up the disk.
```

We have not set the journal_size_limit and we have a -wal file which is 
224MB in size, somewhat larger than 4MB. We are running


3.8.2 2013-12-06 14:53:30 27392118af4c38c5203a04b8013e1afdb1cebd0d

which does not appear to have the code in 3.11.0 so that the WAL file is 
proportional to the size of the transaction. From the same page of the 
manual:


```
Very large write transactions.

A checkpoint can only complete when no other transactions are running, 
which
means the WAL file cannot be reset in the middle of a write transaction. 
So a large change to a large database
might result in a large WAL file. The WAL file will be checkpointed once 
the write transaction completes
(assuming there are no other readers blocking it) but in the meantime, 
the file can grow very big.


As of SQLite version 3.11.0, the WAL file for a single transaction 
should be proportional in size to
the transaction itself. Pages that are changed by the transaction should 
only be written into the WAL
file once. However, with older versions of SQLite, the same page might 
be written into the WAL file multiple

times if the transaction grows larger than the page cache.
```

We think that using WAL mode works for us, indeed inspection seems to 
indicate it does, but the size of the -wal file appears to be far larger 
than would be expected. Is there a problem here? It doesn’t appear to 
be a problem but would welcome any comments.
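
(For reference, if we do decide to cap it, our reading of the docs is 
that it is a one-liner along these lines; the 64MB figure is purely 
illustrative.)

```
-- after a checkpoint, truncate the -wal file back to roughly this many
-- bytes instead of leaving it at its high-water mark
PRAGMA journal_size_limit = 67108864;   -- 64MB, example value only
```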


Thanks for taking the time to reply.

Rob


On 6 Aug 2016, at 22:35, R Smith wrote:


On 2016/08/06 10:50 PM, Rob Willett wrote:


Our understanding of this is that many processes can READ the 
database at the same time but NO process can INSERT/UPDATE if another 
is reading. We had thought that one process can write and multiple 
processes can read. Our reading (no pun intended) now of this 
paragraph from the manual is that you cannot write if one or more 
processes is reading. Have we understood this correctly? If so is 
there an easy way to get around this?


The Write-Ahead-Log (WAL) journal mode will help you. It basically 
allows a writer to write to the WAL Log in stead of the main database 
so that any amount of readers can still do their thing reading the 
database (and the parts of the WAL journal that is already committed, 
or even parts still in progress if you use "read_uncommitted" mode). 
SQLite then pushes committed data into the DB file based on 
Checkpo

[sqlite] Locking databases - Possibly (probably?) a dumb question

2016-08-06 Thread Rob Willett

Hi,

We’ve been using Sqlite through Perl for some time now and have started 
to get more adventurous.


Our SQLite database is around 32GB in size, is created and manipulated 
by a single Perl process and is working well.


What we now want to do is mine the database using a very long running 
query to generate analytics. We have the SQL written to query the 
database, we requested help on this list last year, and we are 
happy with the SQL query itself. It takes a long time (between 5 and 60 
mins) as it's a complex query and collates an awful lot of data. Whilst 
we would love to have the query execute in 10 secs, that's not going to 
happen due to the size of the data and the queries we need to execute.


What we have now found is that when we are running the analytics query 
in one Perl process, we can no longer UPDATE the main database through 
another Perl process. We are getting “database is locked” errors.


We don’t need any help with our SQL but we are trying to understand 
how the locking works (at a high level) in SQLite. Reading the docs 
(https://www.sqlite.org/lockingv3.html) for locking seems to indicate 
the problem


```
SHARED	The database may be read but not written. Any number of processes 
can hold SHARED locks at the same time, hence there can be many 
simultaneous readers. But no other thread or process is allowed to write 
to the database file while one or more SHARED locks are active.


```

Our understanding of this is that many processes can READ the database 
at the same time but NO process can INSERT/UPDATE if another is reading. 
We had thought that one process can write and multiple processes can 
read. Our reading (no pun intended) now of this paragraph from the 
manual is that you cannot write if one or more processes is reading. 
Have we understood this correctly? If so is there an easy way to get 
around this?
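
To make the failure mode concrete, this is roughly what we are seeing, 
sketched as two sqlite3 shell sessions (the table name is made up for 
illustration):

```
-- session 1: the analytics query holds a SHARED lock for its whole run
BEGIN;
SELECT count(*) FROM incidents;   -- imagine this taking many minutes

-- session 2, a separate connection, while session 1 is still reading
UPDATE incidents SET status = 2 WHERE id = 1;
-- fails with "database is locked": the writer cannot get the EXCLUSIVE
-- lock it needs to commit while the SHARED lock is still held
```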


Further down the page we find

```
5.1 Writer starvation

In SQLite version 2, if many processes are reading from the database, it 
might be the case that there is never a time when there are no active 
readers. And if there is always at least one read lock on the database, 
no process would ever be able to make changes to the database because it 
would be impossible to acquire a write lock. This situation is called 
writer starvation.


SQLite version 3 seeks to avoid writer starvation through the use of the 
PENDING lock. The PENDING lock allows existing readers to continue but 
prevents new readers from connecting to the database. So when a process 
wants to write a busy database, it can set a PENDING lock which will 
prevent new readers from coming in. Assuming existing readers do 
eventually complete, all SHARED locks will eventually clear and the 
writer will be given a chance to make its changes.



```

We cannot do this because we cannot set the PENDING lock, as we do not know 
when we will need to do a write.


If we have a single reader and therefore no writers we do have a Plan B 
(and a plan C) so whilst there is some work for us (1-2 days), we can 
put a workflow into the system to get around it.


Thanks,

Rob
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Messages posted on Nabble not getting to list

2016-05-27 Thread Rob Willett

I agree with Tim.

I filter all my SQLite messages into their own folder and read them as needed.

I prefer this method to a forum as I can then keep the messages with me. 
I’m often off the internet and it’s good to have them for reading. Also 
I can look back through them and get hints and tips about issues.


Its not perfect but it works for me. I get around the distraction of 
emails coming in through client filtering, I did start filtering at the 
server so that only important stuff came through immediately but that 
didn’t work well.  I now use Mailmate and like the power and lack of 
graphical eye candy it offers.


I’d actively vote against a SQLite forum rather than e-mail (if 
anybody asked me, which I doubt…) SQLite is important to our work so 
having it around is fine by me,


YMMV

Rob


On 27 May 2016, at 10:34, Tim Streater wrote:

On 27 May 2016 at 08:56, Darren Duncan  
wrote:



On 2016-05-26 9:00 PM, Balaji Ramanathan wrote:
The main advantage of forums, and I follow a bunch of them, is that 
I choose
when I want to stop my regular day job and be distracted by them 
rather than

emails coming in and distracting me all the time.


That's not an argument for web forums, rather that's an argument for 
not using
your work email to subscribe to non-work discussion lists; use a 
non-work email
for the discussion lists instead.  You can also configure your email 
client to
only check email when you tell it to rather than constantly. -- 
Darren Duncan


Filter the sqlite mails into their own mailbox. They can then be read 
at a convenient moment.


I certainly don't want a whole lot of sub-forums and the like, each of 
which has to be checked in case there's something interesting there.


--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-05 Thread Rob Willett
Yes, realised after I sent the e-mail that I hadn't said we had 
created the destination file in which to run the command. You cannot 
actually run rsync --inplace if the destination file doesn't exist, we 
found that out earlier ;) So it was a reasonable test though only one 
test. We need to do more to see what's going on.

Top marks for spotting our deliberate omission :)

Rob

On 5 May 2016, at 17:42, J Decker wrote:

> On Thu, May 5, 2016 at 9:38 AM, Rob Willett
>  wrote:
>> Mmmm…. Initial tests are not conclusive, it does look as if using 
>> rsync
>> --inplace does speed things up but nowhere near as much as we 
>> anticipated.
>>
>> Testing consisted of a 6GB test database which is a backup copy from 
>> a few
>> months ago.
>>
>> We timed copying the database over using cp
>>
>> # time cp tfl.sqlite.backup t1
>>
>> real2m30.528s
>> user0m0.052s
>> sys 0m10.403s
>>
>> We then edited the database and deleted the contents of a table that 
>> would
>> have changed over the lifetime of the database. We will freely admit 
>> we have
>> no idea where this table is in the database file and have no 
>> intention of
>> finding out. Thats SQLites problem :) The file had 65,000 lines or so 
>> and
>> would have been updated regularly at the start of the database and 
>> over the
>> last few months would have had small daily updates but they would be 
>> getting
>> fewer and fewer.
>>
>> We then did
>>
>> # time rsync -av --inplace --no-whole-file tfl.sqlite.backup test_db
>> sending incremental file list
>> tfl.sqlite.backup
>>
>>
>
> was the file already in place with sending changes?  Or is it a full
> copy of the whole thing all the time?   if the latter... well cp is
> gonna really be as good as it gets
>
>>
>> sent 564,081,269 bytes  received 623,255 bytes  4,805,995.95 
>> bytes/sec
>> total size is 6,067,933,184  speedup is 10.75
>>
>> real1m57.689s
>> user2m4.947s
>> sys 0m8.952s
>>
>> The --no-whole-file is apparently needed as well as --inplace.
>>
>> We checked the md5sums at the end
>>
>> # md5sum tfl.sqlite.backup test_db t1
>> b5bd91cc9b49ee1f54a8a2d013005586  tfl.sqlite.backup
>> b5bd91cc9b49ee1f54a8a2d013005586  test_db
>> b5bd91cc9b49ee1f54a8a2d013005586  t1
>>
>> So we can see that the file integrity has been preserved which is 
>> what we
>> expect but its always good to check. Yes we know that md5sum is not 
>> perfect
>> but for this purpose its fine. However the rsync vs cp time is not as 
>> big a
>> difference as we expected. Its still taking 80% of the time of the 
>> cp.
>>
>> Our first thoughts are that 80% is still better than 100% so thats 
>> good, our
>> second thoughts are that we need to test this on a more 
>> representative set
>> of changes to the database. Deleting the contents of a table that has 
>> been
>> changed over the lifetime of the database may be the worst-case 
>> scenario as
>> it might well touch many, many pages in the database. We are 
>> certainly not
>> expert enough to comment on this assumption of how rows are 
>> distributed in
>> the database and if anybody would like to chip in, please do so.
>>
>> Our intention now is to take a more recent and representative 
>> database, run
>> a days and a weeks set of database transactions through it which is 
>> easy
>> enough, though takes time, and see how that compares.
>>
>> Thanks for the suggestion,
>>
>> Rob,
>>
>> On 5 May 2016, at 16:42, J Decker wrote:
>>
>>> Instead of cp, rsync might help it is able to send delta changes.
>>>
>>> On Wed, May 4, 2016 at 10:55 AM, Rob Willett
>>>  wrote:
>>>>
>>>> Scott,
>>>>
>>>> OK, We can see how to do this (I think). Our app is written in Perl 
>>>> and
>>>> we?d
>>>> just need to capture the command we write down. The only issue I 
>>>> can
>>>> think
>>>> of is the prepare statement and making sure we capture the right 
>>>> SQL
>>>> command. W
>>>>
>>>> We?ll dig into it and have a look,
>>>>
>>>> Thanks for taking the time to reply.
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 4 May 2016, at 18:52, Scott Robison wrote:
>>>>
>>>>> On Wed, May 4, 2016 a

[sqlite] Fastest way to backup/copy database?

2016-05-05 Thread Rob Willett
Mmmm…. Initial tests are not conclusive, it does look as if using 
rsync --inplace does speed things up but nowhere near as much as we 
anticipated.

Testing consisted of a 6GB test database which is a backup copy from a 
few months ago.

We timed copying the database over using cp

# time cp tfl.sqlite.backup t1

real2m30.528s
user0m0.052s
sys 0m10.403s

We then edited the database and deleted the contents of a table that 
would have changed over the lifetime of the database. We will freely 
admit we have no idea where this table is in the database file and have 
no intention of finding out. That's SQLite's problem :) The file had 
65,000 lines or so and would have been updated regularly at the start of 
the database and over the last few months would have had small daily 
updates but they would be getting fewer and fewer.

We then did

# time rsync -av --inplace --no-whole-file tfl.sqlite.backup test_db
sending incremental file list
tfl.sqlite.backup



sent 564,081,269 bytes  received 623,255 bytes  4,805,995.95 bytes/sec
total size is 6,067,933,184  speedup is 10.75

real1m57.689s
user2m4.947s
sys 0m8.952s

The --no-whole-file is apparently needed as well as --inplace.

We checked the md5sums at the end

# md5sum tfl.sqlite.backup test_db t1
b5bd91cc9b49ee1f54a8a2d013005586  tfl.sqlite.backup
b5bd91cc9b49ee1f54a8a2d013005586  test_db
b5bd91cc9b49ee1f54a8a2d013005586  t1

So we can see that the file integrity has been preserved, which is what 
we expect but it's always good to check. Yes, we know that md5sum is not 
perfect but for this purpose it's fine. However the rsync vs cp time is 
not as big a difference as we expected. It's still taking 80% of the time 
of the cp.

Our first thoughts are that 80% is still better than 100% so that's good, 
our second thoughts are that we need to test this on a more 
representative set of changes to the database. Deleting the contents of 
a table that has been changed over the lifetime of the database may be 
the worst-case scenario as it might well touch many, many pages in the 
database. We are certainly not expert enough to comment on this 
assumption of how rows are distributed in the database and if anybody 
would like to chip in, please do so.

Our intention now is to take a more recent and representative database, 
run a day's and a week's set of database transactions through it, which is 
easy enough, though takes time, and see how that compares.

Thanks for the suggestion,

Rob,

On 5 May 2016, at 16:42, J Decker wrote:

> Instead of cp, rsync might help it is able to send delta changes.
>
> On Wed, May 4, 2016 at 10:55 AM, Rob Willett
>  wrote:
>> Scott,
>>
>> OK, We can see how to do this (I think). Our app is written in Perl 
>> and we'd
>> just need to capture the command we write down. The only issue I can 
>> think
>> of is the prepare statement and making sure we capture the right SQL
>> command. W
>>
>> We'll dig into it and have a look,
>>
>> Thanks for taking the time to reply.
>>
>> Rob
>>
>>
>> On 4 May 2016, at 18:52, Scott Robison wrote:
>>
>>> On Wed, May 4, 2016 at 11:47 AM, Rob Willett
>>> 
>>> wrote:
>>>
>>>> Scott,
>>>>
>>>> Thats an interesting idea. Is there an option in SQLite to do this 
>>>> for
>>>> us,
>>>> or do we have to write a small shim in our app?
>>>>
>>>> I like the idea of this as its simple and elegant.
>>>
>>>
>>>
>>> It would require a little extra work on your part. Nothing built 
>>> into the
>>> system that would accomplish this directly. However, I've done 
>>> similar
>>> things and they don't involve a ton of overhead. You could use 
>>> another
>>> SQLite database as the append only log, or a simple text file.
>>>
>>> I'm not aware of a free lunch solution, sadly.
>>>
>>>
>>>>
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 4 May 2016, at 16:51, Scott Robison wrote:
>>>>
>>>> This is going to become a bigger problem for us as the database 
>>>> will
>>>>>>>
>>>>>>> only get bigger so any advice welcomed.
>>>>>>>
>>>>>>
>>>>> Perhaps, rather than backing up the live data, you create an 
>>>>> append only
>>>>> log of each and every query you send to the database. Should you 
>>>>> need to
>>>>> restore, you replay the log of statements. Or at the appointed 
>>>>> backup
>>>>> time,
>>>>>

[sqlite] Fastest way to backup/copy database?

2016-05-05 Thread Rob Willett
Hi,

We did look at this before, and discarded the idea but I can't 
remember why.

I've just looked again and seen the --inplace option which I 
wasn't aware of. That *might* help and be an interesting solution. We 
know we can make cp work, though with a little downtime. We'll 
investigate rsync --inplace on a closed (and definitely not working) 
database, see what happens and report back. It should be easy to test.

Thanks for the information

Rob

On 5 May 2016, at 16:42, J Decker wrote:

> Instead of cp, rsync might help it is able to send delta changes.
>
> On Wed, May 4, 2016 at 10:55 AM, Rob Willett
>  wrote:
>> Scott,
>>
>> OK, We can see how to do this (I think). Our app is written in Perl 
>> and we'd
>> just need to capture the command we write down. The only issue I can 
>> think
>> of is the prepare statement and making sure we capture the right SQL
>> command. W
>>
>> We'll dig into it and have a look,
>>
>> Thanks for taking the time to reply.
>>
>> Rob
>>
>>
>> On 4 May 2016, at 18:52, Scott Robison wrote:
>>
>>> On Wed, May 4, 2016 at 11:47 AM, Rob Willett
>>> 
>>> wrote:
>>>
>>>> Scott,
>>>>
>>>> Thats an interesting idea. Is there an option in SQLite to do this 
>>>> for
>>>> us,
>>>> or do we have to write a small shim in our app?
>>>>
>>>> I like the idea of this as its simple and elegant.
>>>
>>>
>>>
>>> It would require a little extra work on your part. Nothing built 
>>> into the
>>> system that would accomplish this directly. However, I've done 
>>> similar
>>> things and they don't involve a ton of overhead. You could use 
>>> another
>>> SQLite database as the append only log, or a simple text file.
>>>
>>> I'm not aware of a free lunch solution, sadly.
>>>
>>>
>>>>
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 4 May 2016, at 16:51, Scott Robison wrote:
>>>>
>>>> This is going to become a bigger problem for us as the database 
>>>> will
>>>>>>>
>>>>>>> only get bigger so any advice welcomed.
>>>>>>>
>>>>>>
>>>>> Perhaps, rather than backing up the live data, you create an 
>>>>> append only
>>>>> log of each and every query you send to the database. Should you 
>>>>> need to
>>>>> restore, you replay the log of statements. Or at the appointed 
>>>>> backup
>>>>> time,
>>>>> you replay the day's log of statements into another database. No 
>>>>> need to
>>>>> ever take the live database offline at the cost of slightly longer
>>>>> running
>>>>> commands during the day to handle the append operation.
>>>>> ___
>>>>> sqlite-users mailing list
>>>>> sqlite-users at mailinglists.sqlite.org
>>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>>
>>>> ___
>>>> sqlite-users mailing list
>>>> sqlite-users at mailinglists.sqlite.org
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>>
>>>
>>>
>>> --
>>> Scott Robison
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Dan,

That's NOT the case for us so that explains why things are slow. Mmm… 
as I recall we never did get a backup to finish…. Now we know why :)

Rob

On 4 May 2016, at 18:53, Dan Kennedy wrote:

> On 05/05/2016 12:45 AM, Rob Willett wrote:
>> Ryan,
>>
>> Ah! The penny drops, we didn't realise that with the backup 
>> API. That explains a great deal. We must have missed that in the 
>> docs. Blast.
>
>
> There is one exception to this:
>
> If the database is written to via the same database handle that is 
> being used as the source db by the backup API, then the backup is not 
> restarted. In this case if any pages that have already been 
> transferred to the backup db are modified the new versions are written 
> into the backup db at the same time as the source is updated.
>
> Dan.
>
>
>
>
>
>>
>> We've looked around for other providers in Europe and the cost 
>> differences are very high. We need to be in the EU for various data 
>> protection reasons. Until now we haven't had any issues as we 
>> don't
>> period of time, so the rate limited IO has not been a problem.
>>
>> One of our options is to do what you suggest with a second database 
>> server and run them hot/warm. We had already thought of that but not 
>> got around to it as the setting up time is quite high (we need a bank 
>> of servers, feeding things from one server to another), but our 
>> immediate issue is simply copying the 10GB database. The downside of 
>> the second server is moving 10GB data files around the internet 
>> afterwards back to the failed server. Rebuilding from scratch is a 
>> pain as it takes around 2-3 weeks to rebuild the database from 
>> scratch as we have to process every file again (circa 200,000) in 
>> order and each file takes around 4-8 secs to run.
>>
>> I think the backup solution is the tried and tested Keep-It-Simple 
>> shell script. We pause the queue upstream which stops the update 
>> process, do a cp and then restart the queue again. All of this is 
>> doable in shell script.
>>
>> Rob
>>
>> On 4 May 2016, at 18:22, R Smith wrote:
>>
>>> On 2016/05/04 2:35 PM, Rob Willett wrote:
>>>> Dominque,
>>>>
>>>> We put together a quick C program to try out the C API a few weeks 
>>>> ago, it worked but it was very slow, from memory not much different 
>>>> to the sqlite command line backup system. We put it on the back 
>>>> burner as it wasn't anywhere near quick enough.
>>>
>>> You do realize that the backup API restarts the backup once the 
>>> database content changes, right? I'm sure at the rates you describe 
>>> and update frequency, that backup would never finish. The backup API 
>>> is quite fast if your destination file is on a not-too-slow drive, 
>>> but you will have to stop the incoming data to allow it to finish.
>>>
>>> As an aside - you need a better provider, but that said, and if it 
>>> was me, I would get two sites up from two different providers, one 
>>> live, one stand-by, both the cheap sort so costs stay minimal 
>>> (usually two cheap ones are much cheaper than the next level beefy 
>>> one). Feed all updates/inserts to both sites - one then is the 
>>> backup of the other, not only data-wise, but also can easily be 
>>> switched to by simple DNS redirect should the first site/provider go 
>>> down for any reason.  The second site can easily be interfered with 
>>> / copied from / backed up / whatever without affecting the service 
>>> to the public.
>>>
>>> I only do this with somewhat critical sites, but your use-case 
>>> sounds like it might benefit from it. My second choice would be to 
>>> simply stop operations at a best-case time-slot while the backup / 
>>> copy completes.
>>>
>>> Cheers,
>>> Ryan
>>>
>>>
>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Scott,

OK, we can see how to do this (I think). Our app is written in Perl and 
we'd just need to capture the command we write down. The only issue I 
can think of is the prepare statement and making sure we capture the 
right SQL command.
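
As a first cut we are imagining something along these lines on the 
SQLite side, with the shim doing the logging (the table, columns and 
the logged statement below are made up for illustration):

```
-- a separate, append-only log database
CREATE TABLE IF NOT EXISTS stmt_log (
    id       INTEGER PRIMARY KEY,
    logged   TEXT DEFAULT (datetime('now')),
    sql_text TEXT NOT NULL
);

-- the shim records every write it is about to run against the main DB
INSERT INTO stmt_log (sql_text)
VALUES ('UPDATE incidents SET status = 2 WHERE id = 12345;');

-- a restore is then just replaying the logged statements in id order
SELECT sql_text FROM stmt_log ORDER BY id;
```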

We'll dig into it and have a look,

Thanks for taking the time to reply.

Rob

On 4 May 2016, at 18:52, Scott Robison wrote:

> On Wed, May 4, 2016 at 11:47 AM, Rob Willett 
> 
> wrote:
>
>> Scott,
>>
>> Thats an interesting idea. Is there an option in SQLite to do this 
>> for us,
>> or do we have to write a small shim in our app?
>>
>> I like the idea of this as its simple and elegant.
>
>
> It would require a little extra work on your part. Nothing built into 
> the
> system that would accomplish this directly. However, I've done similar
> things and they don't involve a ton of overhead. You could use another
> SQLite database as the append only log, or a simple text file.
>
> I'm not aware of a free lunch solution, sadly.
>
>
>>
>>
>> Rob
>>
>>
>> On 4 May 2016, at 16:51, Scott Robison wrote:
>>
>> This is going to become a bigger problem for us as the database will
>>>>> only get bigger so any advice welcomed.
>>>>>
>>>>
>>> Perhaps, rather than backing up the live data, you create an append 
>>> only
>>> log of each and every query you send to the database. Should you 
>>> need to
>>> restore, you replay the log of statements. Or at the appointed 
>>> backup
>>> time,
>>> you replay the day's log of statements into another database. No 
>>> need to
>>> ever take the live database offline at the cost of slightly longer 
>>> running
>>> commands during the day to handle the append operation.
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
> -- 
> Scott Robison
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Scott,

That's an interesting idea. Is there an option in SQLite to do this for 
us, or do we have to write a small shim in our app?

I like the idea of this as it's simple and elegant.

Rob

On 4 May 2016, at 16:51, Scott Robison wrote:

>>> This is going to become a bigger problem for us as the database will
>>> only get bigger so any advice welcomed.
>
> Perhaps, rather than backing up the live data, you create an append 
> only
> log of each and every query you send to the database. Should you need 
> to
> restore, you replay the log of statements. Or at the appointed backup 
> time,
> you replay the day's log of statements into another database. No need 
> to
> ever take the live database offline at the cost of slightly longer 
> running
> commands during the day to handle the append operation.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Ryan,

Ah! The penny drops, we didn't realise that with the backup API. 
That explains a great deal. We must have missed that in the docs. Blast.

We've looked around for other providers in Europe and the cost 
differences are very high. We need to be in the EU for various data 
protection reasons. Until now we haven't had any issues as we don't 
move a significant amount of data around in a very short period of time, 
so the rate-limited IO has not been a problem.

One of our options is to do what you suggest with a second database 
server and run them hot/warm. We had already thought of that but not got 
around to it as the setting up time is quite high (we need a bank of 
servers, feeding things from one server to another), but our immediate 
issue is simply copying the 10GB database. The downside of the second 
server is moving 10GB data files around the internet afterwards back to 
the failed server. Rebuilding from scratch is a pain as it takes around 
2-3 weeks to rebuild the database from scratch as we have to process 
every file again (circa 200,000) in order and each file takes around 4-8 
secs to run.

I think the backup solution is the tried and tested Keep-It-Simple shell 
script. We pause the queue upstream which stops the update process, do a 
cp and then restart the queue again. All of this is doable in shell 
script.

Rob

On 4 May 2016, at 18:22, R Smith wrote:

> On 2016/05/04 2:35 PM, Rob Willett wrote:
>> Dominque,
>>
>> We put together a quick C program to try out the C API a few weeks 
>> ago, it worked but it was very slow, from memory not much different 
>> to the sqlite command line backup system. We put it on the back 
>> burner as it wasn't anywhere near quick enough.
>
> You do realize that the backup API restarts the backup once the 
> database content changes, right? I'm sure at the rates you describe 
> and update frequency, that backup would never finish. The backup API 
> is quite fast if your destination file is on a not-too-slow drive, but 
> you will have to stop the incoming data to allow it to finish.
>
> As an aside - you need a better provider, but that said, and if it was 
> me, I would get two sites up from two different providers, one live, 
> one stand-by, both the cheap sort so costs stay minimal (usually two 
> cheap ones are much cheaper than the next level beefy one). Feed all 
> updates/inserts to both sites - one then is the backup of the other, 
> not only data-wise, but also can easily be switched to by simple DNS 
> redirect should the first site/provider go down for any reason.  The 
> second site can easily be interfered with / copied from / backed up / 
> whatever without affecting the service to the public.
>
> I only do this with somewhat critical sites, but your use-case sounds 
> like it might benefit from it. My second choice would be to simply 
> stop operations at a best-case time-slot while the backup / copy 
> completes.
>
> Cheers,
> Ryan
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Simon,

Thanks for the reply, we were a little surprised at the UNIX cp speed as 
well. We investigated it with the provider in the past over other file 
system speed issues and it turned out that they rate limit the IO ops, 
so you can't consume them all. Our provider runs many servers out of 
their data centre and they want to make sure that one company (us!) 
can't monopolise all the resources. I can see their point of view and 
since we are not a critical company we choose to pay an appropriate rate 
for this. All systems have limitations one way or another. This is one 
of theirs and is acceptable given the price bracket. Now if we were 
paying £1,000 ($1,500)/week we would have had a very different 
conversation :)

We'll investigate the BEGIN IMMEDIATE and see what happens.
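
If it does what we hope, the copy window would presumably look 
something like this (a sketch only, we have not tested it yet):

```
-- in the sqlite3 shell: take the write lock so no new updates can start
BEGIN IMMEDIATE;
-- ... run the cp / rsync of the database file from another terminal ...
COMMIT;   -- nothing was changed, so this simply releases the lock
```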

As you say though, a simple script that suspends a message queue whilst 
a cp takes place and then simply turns the tap back on is very simple 
to code and to see working. I like easy, simple solutions because I'm 
simple at heart.

Best wishes,

Rob

On 4 May 2016, at 14:24, Simon Slavin wrote:

> On 4 May 2016, at 1:35pm, Rob Willett  
> wrote:
>
>> I think that unless we can get the C API to back up in a time close 
>> to that of a cp, the easiest solution is to suspend updates for 10-15 
>> mins out-of-hours and do a simple cp from there. Sometimes a change 
>> in workflow might be the easiest and simplest solution.
>
> This solution may be what you eventually come up with.  But I would 
> like to comment that I used a Unix platform to copy 15 Gigabytes 
> through FireWire from an external hard disk yesterday and that took 
> less than 10 minutes.  Simply duplicating a 10 Gigabyte file onto your 
> boot drive should take considerably less time.
>
> You may be able to use built-in SQLite mechanisms to suspend updates.  
> It might be worth using the SQLite shell tool to execute "BEGIN 
> IMMEDIATE" and see whether that does suspend operations.
>
> On the other hand there's a lot to be said for running a script at 3am 
> which quits the update program, takes the snapshot, then starts them 
> up again.  Not only will this perform the task needed but it would 
> also serve to 'reset' those updating programs in case they have a slow 
> resource leak or some other long-term bug.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Clemens,

We have 8GB of memory which is the most our VPS provider allows. We'd 
like 32GB but it's not an option for us. Our desktops have more than 
that, but the VPS provider is reasonably priced :)

We hadn't considered the WAL mode; my conclusion is that a simple 
change to our workflow is actually simpler: we stop the database updates 
for 15 mins out of hours, cp and then restart. It's not ideal but we're 
not running a nuclear power station or a 24/7 medical facility. Users 
*may* not get traffic updates for 15 mins at 03:00 in the morning. The 
world will keep spinning.

Rob

On 4 May 2016, at 12:58, Clemens Ladisch wrote:

> Rob Willett wrote:
>> We're trying to backup a 10GB live running database 
>> "as-fast-as-we-
>> possibly-can" without stopping updates coming in.
>
> How much memory do you have?  I guess you can't simply read the entire
> database file to force it into the file cache?
>
> In WAL mode, a writer does not block readers.  You have to decide
> whether you can live with its restrictions:
> http://www.sqlite.org/wal.html
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Dominique,

We put together a quick C program to try out the C API a few weeks ago, 
it worked but it was very slow, from memory not much different to the 
sqlite command line backup system. We put it on the back burner as it 
wasn't anywhere near quick enough.

We hadn't seen or found or even looked for RBU; when I read this I 
thought Realtime Backup Unit, I wasn't even close :) Rats

I think that unless we can get the C API to back up in a time close to 
that of a cp, the easiest solution is to suspend updates for 10-15 mins 
out-of-hours and do a simple cp from there. Sometimes a change in 
workflow might be the easiest and simplest solution.

I know you have mentioned VFS shims further down the email trail but 
that's certainly a step too far for us.

Thanks

Rob

On 4 May 2016, at 12:22, Dominique Devienne wrote:

> On Wed, May 4, 2016 at 1:13 PM, Rob Willett
>  wrote:
>> Thanks for the reply,
>>
>> Yes Example 2 in https://www.sqlite.org/backup.html is what we are 
>> talking
>> about. It was very slow to run for us.
>
> Then maybe https://www.sqlite.org/rbu.html is your last change.
> Although I don't see how it could be faster than the Backup API.
>
> I think you should share more details of how you use the Backup API,
> so experts can provide advice on what to try to make it scale better. 
> --DD
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Thanks for the reply,

Yes Example 2 in https://www.sqlite.org/backup.html is what we are 
talking about. It was very slow to run for us.

Rob

On 4 May 2016, at 12:08, Dominique Devienne wrote:

> On Wed, May 4, 2016 at 12:44 PM, Rob Willett
>  wrote:
>> We're trying to backup a 10GB live running database
> [...]
>
>> 2. If we use the command line sqlite <database> .dump > <backup file> it
>> works, but it's very slow.
>
> That's going to SQL text.
> While .backup is page-based, and binary. But not incremental in the
> Shell I believe.
>
>> 3. Using the Sqlite C API works but is also very slow.
>
> Are you talking about https://www.sqlite.org/backup.html ?
> Because Example 2 is exactly your use case.
>
>> So is there any other method of doing a quick snapshot?
>
> Well, https://www.sqlite.org/backup.html is the only supported way 
> IMHO. --DD
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fastest way to backup/copy database?

2016-05-04 Thread Rob Willett
Hi,

We think we know the answer to this, but we'll ask the question 
anyway.

We're trying to back up a 10GB live running database 
"as-fast-as-we-possibly-can" without stopping updates coming in. The 
updates come every 2-3 mins, and write a chunk of data in. We can't 
really stop the database updates, well we can but we don't want to.

1. We had a quick look to see if we could copy the sqlite file over in 
the short interval between updates but sadly cp simply wasn't fast 
enough. We get around 3GB copied before an update happens, which 
basically renders the cp useless.

2. If we use the command line sqlite <database> .dump > <backup file> 
it works, but it's very slow.

3. Using the Sqlite C API works but is also very slow.

4. We don't have the option of an LVM snapshot as the file system is 
in a Container.

So is there any other method of doing a quick snapshot? Failing that, 
our solution will be to stop any updates for the duration of the cp 
command, and then restart the process afterwards. It's not the end of the 
world but it would have to be done out of normal working hours.

This is going to become a bigger problem for us as the database will 
only get bigger so any advice welcomed.

Thanks

Rob


[sqlite] Is it possible that dropping a big table takes very long

2016-04-17 Thread Rob Willett
Cecil,

Linux reporting 8 cores is due to hyper-threading on the four cores. 
That's normal.

One of the cores reporting a high usage is normal if you have a 
single-threaded app that simply cannot be moved, e.g. many Perl programs 
exhibit this behaviour. That's fine and to be expected.

I do not have the whole thread for what you reported but I did read 
somebody ask if you have put your inserts between a BEGIN/END 
transaction. That will make a massive difference to your speed. Also 
I'm unclear as to how Java fits in all of this. Perhaps you gave a 
better indication further up the thread that I do not have.
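
For reference, the BEGIN/END point is simply this (the table and values 
are made up; the UUIDs would come from your program):

```
CREATE TABLE IF NOT EXISTS uuids (uuid TEXT);   -- illustrative table

BEGIN;
INSERT INTO uuids (uuid) VALUES ('0c7f2e3a-1d2b-4c5e-8f90-123456789abc');
INSERT INTO uuids (uuid) VALUES ('5b9d1c44-7e6f-4a1b-9c2d-abcdef012345');
-- ... thousands more inserts in the same batch ...
COMMIT;   -- one journal sync for the whole batch instead of one per row
```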

There are a lot of very, very talented people on the list, I am not 
one of them :), I would strongly urge you to provide more information 
rather than less, e.g. you provide summary information for Java by RSS. 
It would be a lot more useful to have the full raw data so people can 
look for quirks and anomalies rather than simply you interpreting it for 
us. I am NOT the best person to talk about SQLite performance, however I 
am more familiar with Unix/Linux performance and administration.

Since most people here do not have the same issues as you and I have no 
doubt they are hitting Sqlite far harder than I or you can, I would look 
at what you are doing that is different and Java keeps jumping out at 
me. I have run (though not programmed as I'm not a Java developer) a 
number of programs running Java as the backend onto large DB2 databases 
and the first thing the developers would do would be to increase the 
amount of memory that Java is allowed to allocate and use. They would 
tend to throw 4GB at each JVM or more if the sysadmins would let them.

I still don't get a feeling for where the problem is though as the 
information is fragmentary.

Rob

On 17 Apr 2016, at 8:40, Cecil Westerhof wrote:

> 2016-04-17 1:03 GMT+02:00 Keith Medcalf :
>
>>> Have another problem also. My CPU is about 15%, but the load average 
>>> is
>>> also about 15. (This is on a Linux system.) This results (sometimes) 
>>> in a
>>> very sluggish system. Can the load be a SQLite problem, or is it a 
>>> Java
>>> problem? (When the program is not running, the load average is a lot
>>> lower.)
>>
>> You have slow CPU with multiple cores (15% means you probably have 
>> quad
>> core -- probably with Intel Hyper-Slowness as well (as in not SMT), 
>> or you
>> have dual-core SMT), and the dispatcher is brain-dead and dispatching 
>> the
>> single threaded application on the same CPU as more critical OS 
>> workers
>> resulting in swamping that core with compute use and preventing 
>> competing
>> threads (such as the OS or the gooey) from being other than -- well 
>> --
>> gooey.
>>
>> You claim this is Linux.  There are many brands and flavours (and
>> versions) or Linux, all tuned to behave differently.  You do not 
>> specify
>> the particular version and brand it is.
>>
>
> It never hurts to give some extra information. I only expected 
> SQLite
> help. But I do not mind to get other help. :-D
>
> The CPU is: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz.
> It is a little strange. It has four cores but Linux thinks it has 
> eight.
> None of the eight has a high load: sometimes one 60-70%, but most of 
> the
> time a lot lower.
> The operating system is openSUSE 13.2.
>
> free -m gives:
>  total   used   free sharedbuffers 
> cached
> Mem: 15923  15737185   1300105   
> 8131
> -/+ buffers/cache:   7500   8422
> Swap: 2053   1659    394
>
>
> RSSMemory usage for java
> 
> RSSMemory   6 MB by PID=9705
> RSSMemory  54 MB by PID=9731
> RSSMemory  75 MB by PID=28844
> RSSMemory 112 MB by PID=18743
> RSSMemory 121 MB by PID=28880
> RSSMemory 345 MB by PID=1036
> 
> Total used RSSMemory: 713 MB
>
> swap usage for java
> 
> swap   2 MB by PID=28880
> swap   7 MB by PID=28844
> swap 128 MB by PID=9731
> swap 132 MB by PID=9705
> swap 442 MB by PID=1036
> 
> Total used swap: 711 MB
>
> The program has PID 18743 and does not use swap.
>
> ps -l 18743 gives:
> F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTYTIME CMD
> 0 S  1000 18743 28670  4  80   0 - 1718302 futex_ pts/12  45:29 java
> RandomUUID CheckUUID.properties
>
> My own psPid 18743 gives:
> USER   PID  PPID TT   START ELAPSED TIME STAT COMMAND
> cecil18743 28670 pts/12   Apr1616:29:42 00:45:27 Sl+  java
> RandomUUID CheckUUID.properties
>
> So it does not have a high CPU usage about 5%.
>
> I think I have to cancel the program, because I have:
> 23:03:12: Inserted6.00e+07 UUID's
> 00:50:33: Inserted6.10e+07 UUID's
> 02:56:31: Inserted6.20e+07 UUID's
> 04:56:06: Inserted6.30e+07 UUID's
> 06:53:03: Inserted6.40e+07 UUID's
> 09:14:08: Inserted

[sqlite] Is it possible that dropping a big table takes very long

2016-04-16 Thread Rob Willett
Cecil,

If you have a load average of 15 then that normally means you have a 
massively overloaded Linux box. I don't know your system but I get 
worried around a load average of 3-4 on our boxes. Load average is a 
very crude measurement but a high number tends to be bad.

If your CPU is only running at 15% (how do you know this?) then the 
problem is likely elsewhere. My first thought is swap space: check what 
the disk IO is. The fact you mention Java would immediately make me look 
at the amount of memory allocated to the JVM. For some reason many 
JVMs allocate a pitifully small amount of memory, Java then runs out 
of memory pretty quickly and spends the next few secs/mins/hours 
fighting with the OS for resources.

Top, netstat, ps etc are your friends here. You need to look at them and 
see what's kicking your load average into the stratosphere. I'd be 
surprised if it's SQLite; we don't normally drop tables, but from 
memory, dropping a 10GB table took no time. It might have been seconds 
or a minute, we didn't measure it as it wasn't an issue.

Rob

On 16 Apr 2016, at 20:25, Cecil Westerhof wrote:

> 2016-04-16 20:36 GMT+02:00 R Smith :
>
>>
>>
>> On 2016/04/16 3:39 PM, Cecil Westerhof wrote:
>>
>>> 2016-04-16 14:52 GMT+02:00 R Smith :
>>>
>>> Let me try the 100 million rows, this may take some time - I will 
>>> post
 again when it is done.

 I am curious.
>>>
>>
>> Well, here it is then, 100-million rows: The INSERT took a lot of 
>> time,
>> near 5 hours, but again, the DROP was only a few seconds.
>>
>
> That is what I would expect, but not what I see. :'-(
>
> I am filling the database again, but now with text UUID instead of 
> blob
> UUID. That takes a "little" more time. When it is filled I try 
> again.
>
> Have another problem also. My CPU is about 15%, but the load average 
> is
> also about 15. (This is on a Linux system.) This results (sometimes) 
> in a
> very sluggish system. Can the load be a SQLite problem, or is it a 
> Java
> problem? (When the program is not running, the load average is a lot 
> lower.)
>
> -- 
> Cecil Westerhof
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Can't understand why I need this cast

2016-02-18 Thread Rob Willett
Richard,

The example is highly contrived. The actual SQL we are/were actually 
interested in is

strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60)),

AND

strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 60))

The rest of the SQL was to test what was happening with strftime and 
whether integers were being produced from the expressions used to test the two SQL 
statements above. We thought the contrived test expressions worked OK 
which was why we were puzzled as to why

strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60)),

worked and

strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 60))

failed, though the only difference was the change in the boolean 
operand.

As Tim Streater pointed out we would actually use epoch seconds as this 
is more natural and we have that information in another database.

However we threw/hacked the above SQL together to test an idea we had, 
specifically to find traffic incidents three hours before and three 
hours after other traffic incidents. The test database table we tested it 
with didn't have the epoch seconds in, so we used strftime instead to 
check our idea. That's when we got results that puzzled us. We know we 
have alternate and better solutions to actually doing the work; what's 
worrying us is that our understanding of SQLite, strftime and implicit 
cast conversions is incorrect. The fact that strftime returns a text 
string is fine and that fits in with what we expected. We simply got the 
rest of the conversions wrong. We appear to have created a boolean 
expression that worked (a false positive) and that threw us, as we 
misunderstood (quite badly) what was going on.

We now know we have to be more careful about how we use strftime and we 
will be in our logic.
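
For the record, the false positive can be reproduced with bare literals 
(the numbers below are just the values from our test row):

sqlite> SELECT '1304398800' >= 1304388000, '1304398800' <= 1304409600;
1|0

Because strftime() returns text, and with no column affinity in play a 
text value always compares greater than any integer or real in SQLite, 
the >= test is always true and the <= test is always false, whatever the 
actual numbers are.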

Thanks to everybody for the help, we've learnt something new today 
which is always good.

Best wishes,

Rob

On 18 Feb 2016, at 12:45, Richard Hipp wrote:

> On 2/18/16, Rob Willett  wrote:
>>
>> select
>>  strftime('%s' , starttime),
>>  strftime('%s' , starttime) - (180 * 60),
>>strftime('%s' , starttime) + (180 * 60),
>>  strftime('%s' , starttime) - (strftime('%s' , starttime) - (180 * 
>> 60)),
>>  (strftime('%s' , starttime) + (180 * 60)) - strftime('%s' , 
>> starttime),
>>  strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 
>> 60)),
>>  strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 
>> 60))
>> from
>>  test1;
>>
>
> I don't exactly understand why you are doing the query above, but for
> whatever your purpose is, the following formulation seems simpler to
> me:
>
> SELECT
>  strftime('%s', starttime),
>  strftime('%s', starttime, '-180 seconds'),
>  strftime('%s', starttime, '+180 seconds'),
>  (julianday(starttime) - julianday(starttime, '-180 seconds'))*86400,
>  (julianday(starttime, '+180 seconds') - julianday(starttime))*86400,
>  julianday(starttime) >= julianday(starttime, '-180 seconds'),
>  julianday(starttime) <= julianday(starttime, '+180 seconds')
> FROM
> test1;
>
>
>
>
> -- 
> D. Richard Hipp
> drh at sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Can't understand why I need this cast

2016-02-18 Thread Rob Willett
Tim,

We actually do store the timestamps as epoch seconds, but we wrote a 
quick piece of SQL to test something out and wanted to use the ISO date. 
That's when the SQL failed and we couldn't understand why. We hate it 
when we don't understand why things don't work the way we expect. 
Our OCD kicks in and annoys us :)
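
(For completeness, converting between the two forms is a one-liner each 
way; a minimal sketch using the value from the test table:)

sqlite> SELECT strftime('%s', '2011-05-03T05:00:00Z') + 0;
1304398800
sqlite> SELECT strftime('%Y-%m-%dT%H:%M:%SZ', 1304398800, 'unixepoch');
2011-05-03T05:00:00Z

The "+ 0" is only there to force the text result of strftime() into an 
integer.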

We've investigated it further and it still makes no sense, though Quan 
Yong Zhai has helped. It appears that

strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60))

works BUT

strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 60))

does not work.

The only difference is the boolean operand.

We know how to solve the problem; we're puzzled, though, as our 
understanding is clearly wrong.

Thanks for replying,

Rob

On 18 Feb 2016, at 12:14, Tim Streater wrote:

> On 18 Feb 2016 at 10:20, Rob Willett  
> wrote:
>
>> I'm sure this is a really dumb question but I'm struggling to
>> understand why the following SQL is needed for what should be a 
>> trivial
>> SQL expression.
>>
>> I've minimised the example down to (hopefully) make it simpler.
>>
>> I have a table with an ISO date StartTime in it held as a string.
>
> I'd be inclined to store your dates as seconds since the epoch. That 
> way arithmetic and comparisons become easy, and your SQL looks 
> simpler. Convert to a string for display. But perhaps your application 
> prevents that for some reason.
>
> --
> Cheers  --  Tim
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Can't understand why I need this cast

2016-02-18 Thread Rob Willett
Quan,

Thanks, that almost makes sense. What still confuses us is that the 
other maths expressions work OK without a cast.

e.g.

strftime('%s' , starttime) - (180 * 60)

and

(strftime('%s' , starttime) + (180 * 60)) - strftime('%s' , starttime)

and

strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60)) 
,

all work but

strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 60))

doesn't.



So why does

strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60))

work and

strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 60))

not work?
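
For reference, this is what typeof() reports for each piece (using the 
literal date from the test table rather than the real data):

sqlite> SELECT typeof(strftime('%s', '2011-05-03T05:00:00Z')),
   ...>        typeof(strftime('%s', '2011-05-03T05:00:00Z') - (180 * 60)),
   ...>        typeof(strftime('%s', '2011-05-03T05:00:00Z') + (180 * 60));
text|integer|integer

So the arithmetic results are integers, but the bare strftime() value on 
the left of each comparison is still text.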

Thanks,

Rob

On 18 Feb 2016, at 11:25, Quan Yong Zhai wrote:

> It looks like strftime('%s',...) should return an integer and 
> strftime('%J', ...) should return a double value. But it always returns 
> a text value.
> ________
> From: Rob Willett<mailto:rob.sqlite at robertwillett.com>
> Sent: 2016/2/18 18:20
> To: SQLite mailing 
> list<mailto:sqlite-users at mailinglists.sqlite.org>
> Subject: [sqlite] Can't understand why I need this cast
>
> Hi,
>
> I'm sure this is a really dumb question but I'm struggling to
> understand why the following SQL is needed for what should be a 
> trivial
> SQL expression.
>
> I've minimised the example down to (hopefully) make it simpler.
>
> I have a table with an ISO date StartTime in it held as a string.
>
> 
> sqlite> .schema test1
> CREATE TABLE "Test1" (
>   "Id" INTEGER NOT NULL,
>   "StartTime" TEXT NOT NULL,
>  PRIMARY KEY("Id")
> );
>
> 
>
> I add in some test ISO date data that *looks* OK
>
> 
> sqlite> select * from test1;
> 1|2011-05-03T05:00:00Z
> 
>
> I run the following to check that the ISO date is between a range of
> seconds, i.e. it's greater than three hours ago and less than three 
> hours
> in the future. I know this is contrived but this is the minimum test
> case from a far larger query.
>
> select
>  strftime('%s' , starttime) ,
>  strftime('%s' , starttime) - (180 * 60) ,
>   strftime('%s' , starttime) + (180 * 60) ,
>  strftime('%s' , starttime) - (strftime('%s' , starttime) - (180 * 
> 60))
> ,
>  (strftime('%s' , starttime) + (180 * 60)) - strftime('%s' , starttime) ,
>  strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 
> * 60))
> ,
>  strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 
> * 60))
> from
>  test1;
>
> What I expect to see is
>
> 1304398800|1304388000|1304409600|10800|10800|1|0
>
> 1. the ISO start time as secs - Correct
> 2. The ISO start time in secs minus 10800 - Correct
> 3, The ISO start time in secs plus 10800 - Correct
> 4. The value 10800 to check that the maths is correct - Correct
> 5. The value 10800 to check that the maths is correct - Correct
> 6. A check that the ISO value in secs is greater than the current time
> minus 10800 - Correct
> 7. A check that the ISO value in secs is less than the current time 
> plus
> 10800 secs - INCORRECT
>
> if I CAST the value of the last column
>
> select
>  strftime('%s' , starttime) ,
>  strftime('%s' , starttime) - (180 * 60) ,
>   strftime('%s' , starttime) + (180 * 60) ,
>  strftime('%s' , starttime) - (strftime('%s' , starttime) - (180 * 
> 60))
> ,
>  (strftime('%s' , starttime) + (180 * 60)) - strftime('%s' , 
> starttime)
> ,
>  strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 
> * 60))
> ,
>  strftime('%s' , starttime) <= cast((strftime('%s' , starttime) + 
> (180 *
> 60)) as integer)
> from
>  test1;
>
> I get the right answer
>
> 1304398800|1304388000|1304409600|10800|10800|1|1
>
> I'm puzzled as to why this should be. Why would I need to cast an
> integer to an integer? I'm sure there's an obvious answer but I
> can't find it.
>
> Any suggestions please?
>
> Thanks,
>
> Rob
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Can't understand why I need this cast

2016-02-18 Thread Rob Willett
Hi,

I'm sure this is a really dumb question but I'm struggling to 
understand why the following SQL is needed for what should be a trivial 
SQL expression.

I've minimised the example down to (hopefully) make it simpler.

I have a table with an ISO date StartTime in it held as a string.


sqlite> .schema test1
CREATE TABLE "Test1" (
 "Id" INTEGER NOT NULL,
 "StartTime" TEXT NOT NULL,
PRIMARY KEY("Id")
);



I add in some test ISO date data that *looks* OK


sqlite> select * from test1;
1|2011-05-03T05:00:00Z


I run the following to check that the ISO date is between a range of 
seconds, i.e. it's greater than three hours ago and less than three hours 
in the future. I know this is contrived but this is the minimum test 
case from a far larger query.

select
strftime('%s' , starttime) ,
strftime('%s' , starttime) - (180 * 60) ,
 strftime('%s' , starttime) + (180 * 60) ,
strftime('%s' , starttime) - (strftime('%s' , starttime) - (180 * 60)) 
,
(strftime('%s' , starttime) + (180 * 60)) - strftime('%s' , starttime) 
,
strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60)) 
,
strftime('%s' , starttime) <= (strftime('%s' , starttime) + (180 * 60))
from
test1;

What I expect to see is

1304398800|1304388000|1304409600|10800|10800|1|0

1. the ISO start time as secs - Correct
2. The ISO start time in secs minus 10800 - Correct
3, The ISO start time in secs plus 10800 - Correct
4. The value 10800 to check that the maths is correct - Correct
5. The value 10800 to check that the maths is correct - Correct
6. A check that the ISO value in secs is greater than the current time 
minus 10800 - Correct
7. A check that the ISO value in secs is less than the current time plus 
10800 secs - INCORRECT

if I CAST the value of the last column

select
strftime('%s' , starttime) ,
strftime('%s' , starttime) - (180 * 60) ,
 strftime('%s' , starttime) + (180 * 60) ,
strftime('%s' , starttime) - (strftime('%s' , starttime) - (180 * 60)) 
,
(strftime('%s' , starttime) + (180 * 60)) - strftime('%s' , starttime) 
,
strftime('%s' , starttime) >= (strftime('%s' , starttime) - (180 * 60)) 
,
strftime('%s' , starttime) <= cast((strftime('%s' , starttime) + (180 * 
60)) as integer)
from
test1;

I get the right answer

1304398800|1304388000|1304409600|10800|10800|1|1

I'm puzzled as to why this should be. Why would I need to cast an 
integer to an integer? I'm sure there's an obvious answer but I 
can't find it.

Any suggestions please?

Thanks,

Rob


[sqlite] Performance comparison between SQLite and SQL Server?

2016-02-15 Thread Rob Willett
To add to the responses, we use SQLite for our main database, which is 
currently around 60GB in size, so size isn't an issue for SQLite. I am 
aware of other people with far, far larger SQLite databases. We did a 
quick test and ran up to a couple of hundred DB's of data in SQLite 
and it seemed fine.

We chose SQLite over other databases as

1. We didn't need concurrency. All our data is written and read by a 
single process.
2. Its very lightweight. We can move the database around by simply 
copying the file which is fantastic for our use.
3. The speed is very, very fast. We haven't found any significant 
speed differences FOR OUR USE CASE between SQLite and other 'bigger' 
databases. YMMV.
4. The support is top notch. I have bought and paid for govt-scale 
databases for governments, and to be honest the support for SQLite is 
just as good as, and I would say better than, Big Red or Big 
Blue (and I used to work for Big Blue).

That's not to say that SQLite is perfect; it's not. However, no database is 
perfect, as everyone's usage is different. I certainly wouldn't state 
that SQLite is the answer to everyone's problems, and I can state with 
absolute certainty that DB2 is also not the answer for every use.

We are moving into a different phase of our development and we are 
investigating PostGIS for some of our work now, as that *MAY* be a 
better tool for some of our more exotic spatial queries and analysis. No 
reflection on SQLite; it's a case of finding the right tool for the 
right job. Having said that, we *may* end up using SQLite for this area 
as well.

Rob

On 15 Feb 2016, at 8:20, R Smith wrote:

> On 2016/02/15 5:21 AM, admin at shuling.net wrote:
>> Hi,
>>
>> I am just curious whether there is a performance comparison between 
>> SQLite
>> and SQL Server? Surely SQL Server will perform better on huge 
>> database with
>> thousands of tables(more than 10GB size). But whether SQLite will 
>> perform
>> better on smaller database such as one database with one table that 
>> is less
>> than 1GB?
>>
>> Thanks
>
> Some other replies have hinted at this already, but to expand: The one 
> isn't faster than the other in all cases. There are ways in which 
> SQLite is much faster even on a 100GB database - the "Lite" in SQLite 
> doesn't mean database size - it pertains more to the available 
> function-set. It can handle the very large DB's just as well as any 
> other.
>
> Where SQLite lacks (due to the "Lite"-ness) is in not having 
> user-access control, not having programmability (stored procedures and 
> functions, but then you can add custom functions to SQLite in C even, 
> which you can't easily do with the others, especially not with MSSQL). 
> The largest difference however, is that an SQLite connection operates 
> on (talks-to) a file, and the others usually talk to a server.
>
> The main advantage of SQLite is that it can be used for an application 
> file format to your program, complete as if it was a database in 
> itself (which, actually, it is), and moreover, you can embed the 
> entire DB in your application and on your hardware etc - like the 
> billions of handheld devices, phones, tablets, etc. that uses SQLite 
> daily.
>
> For more information, see:
> http://www.sqlite.org/whentouse.html
>
> To answer your speed question - it depends on data shape, size, IO 
> access speeds, Memory on the generation machine etc. Import big 
> datasets in both MSSQL and SQLite, run some queries, chances are some 
> queries will be slightly faster in SQLite, and some are slightly faster in 
> MSSQL.  Single query performance is not really the driver of that 
> decision.
>
> Cheers,
> Ryan
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Problem with accumulating decimal values

2015-12-11 Thread Rob Willett
I *think* this is due to you creating an integer when you first create 
the entries

Try changing from

INSERT INTO fmtemp VALUES (1, 0);

to

INSERT INTO fmtemp VALUES (1, 0.0);
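
(The difference between the two literals is the storage class they end up 
with; a quick check, independent of any table:)

sqlite> SELECT typeof(0), typeof(0.0);
integer|real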

Just did

macpro:js rwillett$ sqlite3
SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE fmtemp (acno INT, balance DECIMAL);
sqlite>  INSERT INTO fmtemp VALUES (1, 0.0);
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> SELECT bal FROM fmtemp;
Error: no such column: bal
sqlite> SELECT balance FROM fmtemp;
123.45
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> SELECT balance FROM fmtemp;
1234.5
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> SELECT balance FROM fmtemp;
1357.95
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> SELECT balance FROM fmtemp;
4567.65
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;
sqlite> UPDATE fmtemp SET balance = balance + 123.45;

[sqlite] Handling the whole select query from an index

2015-09-22 Thread Rob Willett
A quick update on handling the select query. We have some speed results 
which are quite illuminating and positive.

We've been playing around with building a few web services, one of which was 
the postcode lookup that people have kindly helped with.

We have one service that is a simple insert into a database. One piece of data 
comes in (some GPS information), and we need to get it into a database table as 
quickly as possible. It's basically a data logger. 

The other is the postcode lookup, where a user requests a GPS location based on 
a postcode. The postcode lookup uses a single table in SQLite with a unique 
postcode as a key. 
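
For context, the shape of the two endpoints is roughly this; the table and 
column names below are illustrative only, not our real schema:

-- Postcode lookup: one row per postcode, keyed on the postcode itself.
CREATE TABLE postcodes (
  postcode TEXT PRIMARY KEY,
  lat      REAL NOT NULL,
  lon      REAL NOT NULL
);
SELECT lat, lon FROM postcodes WHERE postcode = ?;

-- GPS logger: append-only inserts, one small row per request.
CREATE TABLE gps_log (
  ts   INTEGER NOT NULL,
  lat  REAL NOT NULL,
  lon  REAL NOT NULL
);
INSERT INTO gps_log (ts, lat, lon) VALUES (?, ?, ?);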

We had written some rubbish shell scripts to test both services, which was lazy 
and pretty useless. So we decided to do the job properly and do a load test 
using Siege as the weapon of choice (pun intended). We like Siege as it's dead 
simple and works very well. We can argue over the merits of Siege vs roadrunner 
vs whatever later.

Both tests done on a Mac Pro quad core with 32GB of RAM.

Test 1 - Simulating 100 concurrent users doing 100 requests one after the 
other. This uses morbo as a wrapper for the Perl script

Transactions:  10000 hits
Availability: 100.00 %
Elapsed time:  11.25 secs
Data transferred:   0.53 MB
Response time:  0.11 secs
Transaction rate: 888.89 trans/sec
Throughput: 0.05 MB/sec
Concurrency:   99.32
Successful transactions:   10000
Failed transactions:   0
Longest transaction:0.17
Shortest transaction:   0.00

Mmmm... just under 900 requests per second. 

Test 2 - This is simulating writing a single (small) line of data to a table. 
No special set up has been done regarding pragmas or anything like that. It's a 
pretty simple 'standard' setup, though our Macs have SSDs rather than HDDs. This 
uses morbo as a wrapper script around the Perl.

We simulate 100 concurrent users writing 100 requests again.

Transactions:  10000 hits
Availability: 100.00 %
Elapsed time:  33.15 secs
Data transferred:   0.02 MB
Response time:  0.33 secs
Transaction rate: 301.66 trans/sec
Throughput: 0.00 MB/sec
Concurrency:   99.51
Successful transactions:   10000
Failed transactions:   0
Longest transaction:0.52
Shortest transaction:   0.05

That's 10,000 inserts into the database. We cleared the table out beforehand and 
then checked that the table had 10,000 records in afterwards. It did.

Test 3: As per test1 but using hypnotoad (don't you just love the names). This 
is a more performant wrapper for Perl.

Transactions:  10000 hits
Availability: 100.00 %
Elapsed time:   6.53 secs
Data transferred:   0.53 MB
Response time:  0.06 secs
Transaction rate:1531.39 trans/sec
Throughput: 0.08 MB/sec
Concurrency:   91.40
Successful transactions:   10000
Failed transactions:   0
Longest transaction:0.92
Shortest transaction:   0.00

Performance is almost (ish) doubled to just over 1500 transactions per second. 
No changes to the Perl scripts.

Test 4: as per test2 but using hypnotoad. Database tables cleared out before 
starting

Transactions:  10000 hits
Availability: 100.00 %
Elapsed time:  13.16 secs
Data transferred:   0.02 MB
Response time:  0.13 secs
Transaction rate: 759.88 trans/sec
Throughput: 0.00 MB/sec
Concurrency:   95.96
Successful transactions:   10000
Failed transactions:   0
Longest transaction:0.50
Shortest transaction:   0.00

Double the performance to just over 750 transactions per second. No changes to 
Perl scripts.

Summary

We're rather pleased with the performance of Mojolicious as a web server (or 
rather web service) with SQLite. The Perl script for the postcode lookup is 96 
lines long. The Perl script for the GPS capture is 190 lines and uses a 
slightly different framework for a number of reasons which are too embarrassing 
to go into. 

I have no idea how these compare to other systems, but I think 700 web-based 
insertions per second is pretty good. I have no doubt we could improve the 
speed with some careful optimisations but, to be brutally honest, it's not worth 
it for us.

Hope this helps somebody; it validates our choice of software, so we're pretty 
pleased.

Rob







[sqlite] Handling the whole select query from an index

2015-09-19 Thread Rob Willett
Ryan,

Thanks. We've got 100 requests a second, which may be enough. We'll keep 
looking, though, as time allows.

Rob.

> On 18 Sep 2015, at 18:26, R.Smith  wrote:
> 
> >>>Rob: "We want to do postal code ==> GPS..."
> 
> >>Me: "You can use google apis..."
> 
> >Rob: "Our business is GPS and GIS traffic data"
> 
> Oops yes, that's a whole nother kettle of fish then. To return to some of the 
> older parts of the conversation, I think the SQL route is best (whichever 
> engine is chosen) - I am sure you will need to expand in future. I hope you 
> get the PERL latency sorted out.
> 
> Best of luck!
> Ryan
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Handling the whole select query from an index

2015-09-18 Thread Rob Willett
Yes we are aware of the Google API. Our business is GPS and GIS traffic data so 
we have actually built a lot of the infrastructure ourselves. We have our own 
map server running OpenStreetMap, a routing server and database allowing us to 
do the same queries as you mention. It's actually a lot quicker than two seconds 
in the UK.

Google is great until you start hitting their limits, which aren't that 
high for us (in our opinion), and then they want that scarce commodity, money, 
hence we build our own stuff. Our business model wants a lot of clients, so it's 
cheaper to do our own. 

Rob

> On 18 Sep 2015, at 16:54, R.Smith  wrote:
> 
> 
> 
> On 2015-09-18 08:17 AM, Rob Willett wrote:
>> What we wanted was a postcode to GPS lookup, it was not to pinpoint a house, 
>> you would need a little more information to be that specific, e.g. house 
>> number or a name, but to get to a general area that has a central GPS point.
> 
> As an aside, you know you can already do this via a Google API right? That 
> is, you can find a GPS location for an address or approximate address (city & 
> postal code will do). Usually, if the address is not specific, the nearest 
> road intersection GPS is given or indeed 4 GPS points that denotes the 
> bounding box encircling the area.
> The country is not needed unless the city and postal code combination is 
> ambiguous... so safer to add.
> 
> You can do the reverse too, supply a GPS and obtain an address or approximate 
> address where ambiguous.
> 
> As long as you hit less than 1,000 requests per day, the service is free.
> 
> Results are returned usually in under 2 seconds in nice full XML or JSON.
> 
> Here's an example using my postal detail - one XML and one JSON:
> https://maps.googleapis.com/maps/api/geocode/xml?address=Midrand+1684+South+Africa=en
> https://maps.googleapis.com/maps/api/geocode/json?address=Midrand+1684+South+Africa=en
> 
> 
> (Might be less effort for you to just piggy-back off this until your system 
> becomes very adopted).
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


