Re: [sqlite] Intersecting multiple queries

2020-02-29 Thread Jens Alfke

> On Feb 28, 2020, at 11:49 PM, Hamish Allan  wrote:
> 
> Again, I may be making incorrect assumptions.

Remember the old Knuth quote about the danger of premature optimization. What’s 
the size of your data set? Have you tried making a dummy database of the same 
size and experimenting with queries?

Compiling a statement is very fast in my experience. Milliseconds. Doing it 
once in response to a user command probably won’t be noticeable.

It’s true that LIKE queries tend to be slow because they can’t, in general, be 
optimized. But there are two possibilities you haven’t mentioned:
1. If they all involve prefix matching ("x LIKE 'prefix%'"), SQLite can use an 
index on column x to limit the search to rows where x starts with 'prefix'.
2. You can create an index on a LIKE expression with a specific pattern, e.g. 
"CREATE INDEX foo ON tbl (x LIKE '%something%') WHERE (x LIKE '%something%')". 
I haven’t tested this, but it should speed up a query using that specific LIKE 
test (sketched below).
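
A rough sketch of both, assuming a hypothetical table msgs with a TEXT column x 
(untested; note that the prefix case also needs the index collation to match 
LIKE’s case-sensitivity, which is NOCASE by default):

CREATE INDEX msgs_x ON msgs (x COLLATE NOCASE);
SELECT * FROM msgs WHERE x LIKE 'prefix%';        -- 1. can use msgs_x

CREATE INDEX msgs_x_something ON msgs (x LIKE '%something%')
    WHERE (x LIKE '%something%');
SELECT * FROM msgs WHERE x LIKE '%something%';    -- 2. can use msgs_x_something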

FTS does make these types of queries faster, as long as you’re looking for 
whole words or word prefixes.
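
For example, with FTS5 (a minimal sketch):

CREATE VIRTUAL TABLE msgs_fts USING fts5(x);
SELECT * FROM msgs_fts WHERE msgs_fts MATCH 'something';   -- whole word
SELECT * FROM msgs_fts WHERE msgs_fts MATCH 'someth*';     -- word prefix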

—Jens


Re: [sqlite] CSV import using CLI (header, NULL)

2020-02-29 Thread Warren Young
On Feb 27, 2020, at 11:51 PM, Christof Böckler  wrote:
> 
> 1. There should be no penalty for using header lines in CSV files. Thus a new 
> flag -h for .import is much appreciated.

More than that, SQLite should be able to *actively use* the header when present.

For instance, given:

foo,bar,qux
1,2,3

then there should be a way to import only the first and third columns, perhaps 
with a command like so:

.import data.csv tablename foo,qux

I’ve needed such a thing multiple times.

One case that comes to mind is that tablename has only two columns, and you’re 
purposefully subsetting the CSV data in SQLite.  This is common when importing 
DBs from outside sources, where the other DB has more columns than your local 
SQLite DB.  It’s annoying to need to preprocess the data to strip the “extra” 
columns out.  It’s even more annoying to subset it within SQLite by importing 
into a table that matches the outside DB’s structure and then applying the 
common workaround for the lack of ALTER TABLE DROP COLUMN, roughly as sketched 
below.
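
For concreteness, that workaround looks roughly like this (hypothetical table 
and column names, reusing the data.csv example above):

sqlite> .mode csv
sqlite> .import data.csv staging
sqlite> INSERT INTO tablename (foo, qux) SELECT foo, qux FROM staging;
sqlite> DROP TABLE staging;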

Another case I’ve run into before is that the table layout of the CSV is the 
same as in SQLite, but for some reason you want one or more of the columns to 
take their default values rather than come from the CSV.  You can see this with
time-varying data when the CSV is output from SQLite at time t0 and reimported 
at t1 after some of the data has gone stale, so that letting the stale columns 
revert to their defaults is better than importing obsolete data.

To make the second example concrete, imagine an HVAC monitoring system’s DB: 
the names of the stations and the wear data are still useful, but the last 
known temperature shouldn’t be imported because the old data could be entirely 
inaccurate, and fresh data should be coming along soon after the DB table is 
restored from the backup.  In the meantime, the system can simply report “I 
don’t know” for the temperature measurements.
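
A rough sketch of such a schema (hypothetical names, just to make the example 
concrete):

CREATE TABLE stations (
    name       TEXT PRIMARY KEY,
    wear_hours REAL,
    last_temp  REAL DEFAULT NULL  -- reverts to "I don't know" when not imported
);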

> 2. Missing values in columns with numerical affinity should show up as NULL 
> values.

I don’t think one hard-and-fast rule works for everyone here, if for no other 
reason than that there must be people depending on the current behavior.

Instead, let it be a mode:

.blank default
.blank blank
.blank zero
.blank null

This proposed setting tells the CSV importer how to handle blanks: use the DB 
column’s default value, leave them be, make them zeroes, or make them SQL NULLs.

The default value for this setting would be “blank” for backwards compatibility.

Parallel .zero and .null settings could be added to take care of similar cases. 
The defaults would obviously be “zero” and “null”, respectively.
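
For illustration only (none of this exists today; it is simply the behavior the 
proposal describes), importing with blanks mapped to NULL would look like:

sqlite> .blank null
sqlite> .mode csv
sqlite> .import data.csv tab

and a blank field in a numeric column would then come in as SQL NULL rather 
than ''.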


Re: [sqlite] Select statement with ORDER BY specified by column value

2020-02-29 Thread Keith Medcalf

SELECT prop_value FROM Table1 WHERE obj_id=10 AND prop_key='key1' AND 
(prop_tag='ios' OR prop_tag='*') ORDER BY prop_tag == 'ios' DESC LIMIT 1;

You want to order by prop_tag == 'ios' in DESCENDING order.  That is, the true 
(1) before the false (0); the default ascending sort would put the false (0) 
before the true (1).  This works because prop_tag cannot be null.  If it could 
be, then nulls would always sort first (they come before everything else, 
regardless of ascending or descending) unless you did something about it.
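
For instance, if prop_tag could be null, you could force a non-null sort key:

SELECT prop_value FROM Table1 WHERE obj_id=10 AND prop_key='key1' AND 
(prop_tag='ios' OR prop_tag='*') ORDER BY IFNULL(prop_tag == 'ios', 0) DESC LIMIT 1;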

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Marco Bambini
>Sent: Saturday, 29 February, 2020 01:38
>To: SQLite mailing list 
>Subject: [sqlite] Select statement with ORDER BY specified by column
>value
>
>Hi all,
>
>Is there a way to specify an ORDER BY clause by column value?
>
>I have a table declared as:
>CREATE TABLE Table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, obj_id
>INTEGER, prop_key TEXT, prop_value TEXT, prop_tag TEXT DEFAULT '*',
>UNIQUE(obj_id, prop_key, prop_tag))
>
>and a sample query:
>SELECT prop_value FROM Table1 WHERE obj_id=10 AND prop_key='key1' AND
>(prop_tag='ios' OR prop_tag='*') ORDER BY (prop_tag='ios') LIMIT 1;
>
>I would like to prioritise results based on the fact that the prop_tag
>column is 'ios'.
>
>Thanks.
>--
>Marco Bambini
>https://www.sqlabs.com
>
>
>
>





Re: [sqlite] After column add, what should be done to update the schema?

2020-02-29 Thread Anthony DeRobertis

On 2/27/20 7:03 PM, Andy KU7T wrote:

> Hi,
>
> I use a simple ALTER TABLE ADD COLUMN statement. However, when I use SQLite 
> Expert, the DDL is not reflected.


One thing to be aware of is that when SQLite adds the column, it often 
doesn't format it like you'd expect. For example:


CREATE TABLE a (
    col1 integer not null primary key,
    col2 integer
);

adding a column may well wind up with something like:

CREATE TABLE a (
    col1 integer not null primary key,
    col2 integer, col3 integer
);

... note how it's been tacked on to the same line as col2. That can make 
it easy to miss when reading through a pile of SQL.
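
For reference, the statement that produces that appended layout is just:

ALTER TABLE a ADD COLUMN col3 integer;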




Re: [sqlite] CSV import using CLI (header, NULL)

2020-02-29 Thread Dominique Devienne
On Sat, Feb 29, 2020 at 1:42 PM Shawn Wagner  wrote:

> To import a csv file with headers into an existing table, you can use
> .import '| tail -n +2 yourfile.csv' yourtable
> to skip the header line.


On Unix, and by shelling out to native tools, so it’s not portable.
The CLI ought to have something built in, if it doesn’t already.



Re: [sqlite] CSV import using CLI (header, NULL)

2020-02-29 Thread Shawn Wagner
To import a csv file with headers into an existing table, you can use

.import '| tail -n +2 yourfile.csv' yourtable

to skip the header line.

On Sat, Feb 29, 2020, 4:30 AM Christof Böckler  wrote:

> Hi,
>
> I want to share some thoughts and make some suggestions about the SQLite
> 3 command line interface (CLI) tool, especially its behaviour when
> importing CSV files.
>
> CSV files are probably even more common for data exchange than SQLite
> database files. I consider it to be good practice to include a header
> line with column names in every CSV file. Metadata should go with the
> data. This prevents me from mixing up two columns that contain numbers
> with similar value distributions.
>
> Let’s look at an example. A file named data.csv contains three lines:
>
> id,val1,val2
> A,27,8
> B,3,12
>
> Now …
> sqlite3
> sqlite> .mode csv
> sqlite> .import data.csv tab
>
> works and looks good at first, but there is a problem with numerical data.
>
> sqlite> SELECT * FROM tab ORDER BY val2;
> B,3,12
> A,27,8
>
> This is because on import all three columns were created with affinity
> (not to say data type) TEXT (see .schema). As a consequence all numbers
> were imported as strings.
> '12' < '8' is lexicographically OK, but not so in a mathematical sense.
> Having the CSV file in mind I clearly expect to see 8 on the first line
> of the above result.
>
> How to work around this? Just define the table in advance with
> appropriate data types (affinity INTEGER):
>
> sqlite> CREATE TABLE tab (id TEXT, val1 INT, val2 INT);
>
> But now the above .import command will not work as expected, because it
> will result in three rows in our table tab. The first row contains the
> header line.
>
> Two different workarounds come to my mind:
> a) sqlite> DELETE FROM tab WHERE rowid = 1;  -- Dirty hack!
> b) sqlite> .import data.csv temp_tab
> sqlite> INSERT INTO tab SELECT * FROM temp_tab;
> sqlite> DROP TABLE temp_tab;  -- Two tedious extra lines
>
> Both approaches are not very appealing to me. To make CSV files with a
> header line first class citizens, I suggest this instead:
> sqlite> .import -h data.csv tab
> should ignore the first line of data.csv. This import should fail if
> table tab does not already exist.
> This is both shorter and more elegant than both workarounds.
>
>
> Now on to a second issue. Let’s assume you have successfully imported a
> file containing these four lines:
> id,val1,val2
> A,27,8
> B,3,12
> C,,1
> into the table tab mentioned above, resulting in three rows. Notice the
> missing value in column val1 on the last line. This missing value is
> imported as an empty string '' regardless of the affinity of column val1.
>
> That leads to (mathematically) unexpected results from aggregate functions:
> sqlite> SELECT SUM(val1) FROM tab;
> 30    -- Well, treating '' like 0 is OK in this case
> sqlite> SELECT COUNT(val1) FROM tab;
> 3     -- but not so here; only two values/numbers were given in data.csv
> sqlite> SELECT AVG(val1) FROM tab;
> 10    -- the average of 3 and 27 is 15
> sqlite> SELECT MAX(val1) FROM tab;
> ''    -- not to be expected when looking at data.csv
>
> OK, I hear you say, what about this workaround:
> sqlite> UPDATE tab SET val1 = NULL WHERE val1 = '';
> This makes the surprises above go away, but it is again tedious to do for
> all columns containing only numbers.
>
> My suggestion is: If someone goes the extra mile and defines a table in
> advance in order to have an appropriate numerical affinity (INTEGER, REAL
> or NUMERIC) for a column, then it is OK to encode a missing value as NULL
> instead of ''. It seems right though to keep the current behaviour for
> columns with affinity TEXT (the default) or BLOB.
>
> To sum things up:
> 1. There should be no penalty for using header lines in CSV files. Thus a
> new flag -h for .import is much appreciated.
> 2. Missing values in columns with numerical affinity should show up as
> NULL values.
>
> Thanks for reading, I look forward to your opinions about these issues.
>
> Greetings
> Christof


[sqlite] CSV import using CLI (header, NULL)

2020-02-29 Thread Christof Böckler

Hi,

I want to share some thoughts and make some suggestions about the SQLite 
3 command line interface (CLI) tool, especially its behaviour when 
importing CSV files.


CSV files are probably even more common for data exchange than SQLite 
database files. I consider it to be good practice to include a header 
line with column names in every CSV file. Metadata should go with the 
data. This prevents me from mixing up two columns that contain numbers 
with similar value distributions.


Let’s look at an example. A file named data.csv contains three lines:

id,val1,val2
A,27,8
B,3,12

Now …
sqlite3
sqlite> .mode csv
sqlite> .import data.csv tab

works and looks good at first, but there is a problem with numerical data.

sqlite> SELECT * FROM tab ORDER BY val2;
B,3,12
A,27,8

This is because on import all three columns were created with affinity 
(not to say data type) TEXT (see .schema). As a consequence all numbers 
were imported as strings.
'12' < '8' is lexicographically OK, but not so in a mathematical sense. 
Having the CSV file in mind I clearly expect to see 8 on the first line 
of the above result.


How to work around this? Just define the table in advance with 
appropriate data types (affinity INTEGER):


sqlite> CREATE TABLE tab (id TEXT, val1 INT, val2 INT);

But now the above .import command will not work as expected, because it 
will result in three rows in our table tab. The first row contains the 
header line.


Two different workarounds come to my mind:
a) sqlite> DELETE FROM tab WHERE rowid = 1;  -- Dirty hack!
b) sqlite> .import data.csv temp_tab
   sqlite> INSERT INTO tab SELECT * FROM temp_tab;
   sqlite> DROP TABLE temp_tab;  -- Two tedious extra lines

Both approaches are not very appealing to me. To make CSV files with a 
header line first class citizens, I suggest this instead:

sqlite> .import -h data.csv tab
should ignore the first line of data.csv. This import should fail if 
table tab does not already exist.

This is both shorter and more elegant than both workarounds.


Now on to a second issue. Let’s assume you have successfully imported a 
file containing these four lines:

id,val1,val2
A,27,8
B,3,12
C,,1
into the table tab mentioned above, resulting in three rows. Notice the 
missing value in column val1 on the last line. This missing value is 
imported as an empty string '' regardless of the affinity of column val1.


That leads to (mathematically) unexpected results from aggregate functions:
sqlite> SELECT SUM(val1) FROM tab;
30    -- Well, treating '' like 0 is OK in this case
sqlite> SELECT COUNT(val1) FROM tab;
3     -- but not so here; only two values/numbers were given in data.csv
sqlite> SELECT AVG(val1) FROM tab;
10    -- the average of 3 and 27 is 15
sqlite> SELECT MAX(val1) FROM tab;
''    -- not to be expected when looking at data.csv

OK, I hear you say, what about this workaround:
sqlite> UPDATE tab SET val1 = NULL WHERE val1 = '';
This makes the surprises above go away, but it is again tedious to do for 
all columns containing only numbers.
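
After that UPDATE, the aggregates behave as expected:
sqlite> SELECT COUNT(val1) FROM tab;
2
sqlite> SELECT AVG(val1) FROM tab;
15.0
sqlite> SELECT MAX(val1) FROM tab;
27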


My suggestion is: If someone goes the extra mile and defines a table in 
advance in order to have an appropriate numerical affinity (INTEGER, REAL 
or NUMERIC) for a column, then it is OK to encode a missing value as NULL 
instead of ''. It seems right though to keep the current behaviour for 
columns with affinity TEXT (the default) or BLOB.


To sum things up:
1. There should be no penalty for using header lines in CSV files. Thus a 
new flag -h for .import is much appreciated.
2. Missing values in columns with numerical affinity should show up as 
NULL values.


Thanks for reading, I look forward to your opinions about these issues.

Greetings
Christof


Re: [sqlite] Select statement with ORDER BY specified by column value

2020-02-29 Thread Simon Slavin
On 29 Feb 2020, at 8:37am, Marco Bambini  wrote:

> ORDER BY (prop_tag='ios') LIMIT 1;
> 
> I would like to prioritise results based on the fact that the prop_tag column 
> is 'ios'.

SQLite has a conditional construction:

CASE prop_tag WHEN 'ios' THEN 0 ELSE 1 END

So do

SELECT …
ORDER BY CASE prop_tag WHEN 'ios' THEN 0 ELSE 1 END
LIMIT 1;


[sqlite] Select statement with ORDER BY specified by column value

2020-02-29 Thread Marco Bambini
Hi all,

Is there a way to specify an ORDER BY clause by column value?

I have a table declared as:
CREATE TABLE Table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, obj_id INTEGER, 
prop_key TEXT, prop_value TEXT, prop_tag TEXT DEFAULT '*', UNIQUE(obj_id, 
prop_key, prop_tag))

and a sample query:
SELECT prop_value FROM Table1 WHERE obj_id=10 AND prop_key='key1' AND 
(prop_tag='ios' OR prop_tag='*') ORDER BY (prop_tag='ios') LIMIT 1;

I would like to prioritise results based on the fact that the prop_tag column 
is 'ios'.

Thanks.
--
Marco Bambini
https://www.sqlabs.com




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users