[sqlite] GROUP_CONCAT separator and DISTINCT

2017-08-25 Thread Dave Blake
It seems that it is not possible to specify the concatenation separator
when using GROUP_CONCAT with DISTINCT.

For example while this works

SELECT pub_id, GROUP_CONCAT(cate_id, " - ")
FROM book_mast
GROUP BY pub_id;

and this works

SELECT pub_id, GROUP_CONCAT(DISTINCT cate_id)
FROM book_mast
GROUP BY pub_id;

this does not

SELECT pub_id, GROUP_CONCAT(DISTINCT cate_id, " - ")
FROM book_mast
GROUP BY pub_id;

Is that an error, or by design?
Is there another way I can specify the separator when using DISTINCT?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Dave Blake
Thanks all for your input, it has really helped.

In my real world application tmp_keep is a temporary table populated by
examinining a number of other tables etc., and I suddenly realsied that it
could even contain duplicate ids.  Sloppy thinking on my part.

I get the best results by creating another table:
CREATE TEMPORARY TABLE tmp_keep_unique (id1 integer primary key);
INSERT INTO tmp_keep_unique SELECT DISTINCT id1 from tmp_keep;

It takes far longer to create an index on tmp_keep, than it save times on
the above query with one.

Then
*with a primary key on both table1 and tmp_keep_unique*DELETE FROM table1
WHERE id1 NOT IN (SELECT id1 FROM tmp_keep_unique);

is acceptably efficient.

On SQLite NOT EXISTS is quicker than NOT IN, but I also need to use same
SQL on a MySQL implementation, and it behaves the oppoiste.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Dave Blake
Could be keep almost all the records so ~50, but it varies greatly so
sometimes will be just keep 10. I can adjust approach depending on size if
necessary.

Yes the id1 are integer primary keys.

Table1 has a number of indexes and views, so the create new table approach
is less attractive
​
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Deleting records from a large table

2017-03-03 Thread Dave Blake
Say table1 has more then 50 records, and there is a second table
tmp_keep with the ids of the records in table1 to be kept, the rest need to
be deleted. The number of records in tmp_keep can vary from 0 to all the
records in table1, with any values in between.

What is the best strategy for doing the deletion?

For deleting a large number of records (tmp_keep is small), this works
fine:
DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);

But this becomes inefficient when tmp_keep is large.

Any suggestions?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] IN verses EXISTS Query Speed

2016-10-16 Thread Dave Blake
Some simple testing is showing using an EXISTS statement is generally
quicker then using an IN

e.g.

SELECT * FROM tablea
WHERE EXISTS (SELECT 1 FROM tableb WHERE tablea.id  = tableb.id AND ...)

is quicker than
SELECT * FROM tablea
WHERE tablea.id IN  (SELECT tableb.id FROM tableb WHERE ...)

Is there any reason for this to be always true in SQLite, or is it query
dependant?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Using Bitwise Logic In Query

2016-09-08 Thread Dave Blake
Looking for the best way to query a table with an integer column by value
of the lower 16 bits of the data in that column. Does SQLite support
bitwise logic?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] VARCHAR or TEXT to save sapce

2016-09-02 Thread Dave Blake
Can I confirm that defining feilds as VARCHAR(20) rather than TEXT, if I
know I only want 20 chars or less,  will result in a smaller database?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-14 Thread Dave Blake
Thank you for your help. I now understand why the optimiser has
difficulties and 1) is slower than 2)

Thank you also for the union suggestion, although not sure in this case
that it makes anything more readable than breaking the view into tables.

In my real world application my solution is to avoid the left join by
ensuring that all songs have at least one artist. This is a better data
design in the end.

On 14 February 2016 at 18:00, Clemens Ladisch  wrote:

> Dave Blake wrote:
> > Is there anything I can do to get the optimiser to perform 1) with the
> same
> > efficiency as 2)?
>
> See point 3 of <http://www.sqlite.org/optoverview.html#flattening>; the
> optimizer has problems when you use a subquery (i.e., a view) at the
> right side of a left join.
>
> If you really want to keep the view, you could replace the outer join
> with an inner join, and add the non-matching rows separately:
>
> SELECT A.*, view.* FROM A JOIN view ON ... WHERE ...
> UNION ALL
> SELECT A.*, NULL FROM A WHERE ... AND id NOT IN (SELECT A_id FROM view);
>
> Whether this is better than breaking up the view is something you have
> to decide yourself.
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-14 Thread Dave Blake
>You need to sort the output of EXPLAIN QUERY PLAN.

Apologies for not spoting the need to sort the query plan! A fundamental
error on my part.

However, in real world application 1) is significantly slower than 2)
despite returing the same number of rows. If I correctly read the plans
this time, it is because 2) searches the song_artist table using an index
but 1) scans the entire table.

Is there anything I can do to get the optimiser to perform 1) with the same
efficiency as 2)?

In real world application I use a "songartistview" as a means to list and
alias name the fields wanted from joins over several tables. This was fine
until I needed to use it in a left join, and the speed plumited. For
readablity and management I would like to continute to use a view, but it
is so slow. Is there any alternate to having to use tables like 2) does?

On 14 February 2016 at 15:15, Clemens Ladisch  wrote:

> Dave Blake wrote:
> > What I see as wrong is that in 1) (and 4) ) we have a query of the form
> > A LEFT JOIN B WHERE clause involving index fields on A
> >
> > yet the optimiser does not search A, the outer table, first using the
> index.
>
> EXPLAIN QUERY PLAN SELECT song.*, songartistview.* FROM song LEFT JOIN
> songartistview ON song.idSong = songartistview.idSong WHERE song.idSong =1;
> 1|0|0|SCAN TABLE song_artist
> 1|1|1|SEARCH TABLE artist USING INTEGER PRIMARY KEY (rowid=?)
> 0|0|0|SEARCH TABLE song USING INTEGER PRIMARY KEY (rowid=?)
> 0|1|1|SCAN SUBQUERY 1
>
> You need to sort the output of EXPLAIN QUERY PLAN.
> The song table is searched first.
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-14 Thread Dave Blake
>1) has "song LEFT JOIN (song_artist JOIN artist)".
>3) has "(song LEFT JOIN song_artist) JOIN artist".
>2) has "(song LEFT JOIN song_artist) LEFT JOIN artist".
>4) has "song LEFT JOIN (song_artist LEFT JOIN artist)".

OK, I see how you are saying the queries differ.

What I see as wrong is that in 1) (and 4) ) we have a query of the form
A LEFT JOIN B WHERE clause involving index fields on A

yet the optimiser does not search A, the outer table, first using the index.

You rightly said
> A left join forces SQLite to scan the outer table first.  This is
> necessary for the join to work correctly.

But it is not scanning song, the outer table, first. It is not optimising
correctly.

Is there a way with 1) to get it to scan song first? Can you see why that
would be the optimal plan?

On 14 February 2016 at 10:49, Clemens Ladisch  wrote:

> Dave Blake wrote:
> > 3) SELECT song.*, song_artist.*, artist.*
> > FROM song LEFT JOIN song_artist ON song.idSong = song_artist.idSong
> > JOIN artist ON song_artist.idArtist = artist.idArtist
> > WHERE song.idSong =1
> >
> > This ... has the same join combination as 1).
>
> No.
> 1) has "song LEFT JOIN (song_artist JOIN artist)".
> 3) has "(song LEFT JOIN song_artist) JOIN artist".
>
> > Similarly trying a view using outer join
> >
> > CREATE VIEW songartistleftview AS SELECT song_artist.idSong AS idSong,
> > song_artist.idArtist AS idArtist,
> > artist.strArtist AS strArtist
> > FROM song_artist LEFT JOIN artist ON song_artist.idArtist =
> artist.idArtist
> >
> > New query
> > 4) SELECT song.*, songartistleftview.* FROM song
> > LEFT JOIN songartistleftview ON song.idSong = songartistleftview.idSong
> > WHERE song.idSong =1
> >
> > 4) has same slow query plan as 1) despite having all left joins like 2).
>
> There are still differences:
> 2) has "(song LEFT JOIN song_artist) LEFT JOIN artist".
> 4) has "song LEFT JOIN (song_artist LEFT JOIN artist)".
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-13 Thread Dave Blake
>The result is correct.

I am so surprized by your assertion I am sure there is some
misunderstanding. The way 1) is performed, scan song_artist first,  is
suboptimal. Try my example with ~8000 rows and a where clause that returns
multiple songs and you will see what I mean.

>This would not necessarily be the case if it
>executed an outer join as if it were an inner join.

Not sure what you are saying here?

>A left join forces SQLite to scan the outer table first.  This is
>necessary for the join to work correctly.

Yes. With "A LEFT JOIN B" the outer table is "A", and I expect it to be
scanned first. If there is a where clause on index fields from A I expect
that index to be searched first. That is how 2) is done, but in 1) it does
not.

>No, the difference is that the view uses an inner join, while your
>second query use only outer joins.

OK, look at this variation
3) SELECT song.*, song_artist.*, artist.*
FROM song LEFT JOIN song_artist ON song.idSong = song_artist.idSong
JOIN artist ON song_artist.idArtist = artist.idArtist
WHERE song.idSong =1

This produces the same (optimum) query plan as 2), and has the same join
combination as 1). Of couse if there are no song_artist records for idSong
= 1 then this format  of query will not return the same results as 1) or 2)

Similarly trying a view using outer join

CREATE VIEW songartistleftview AS SELECT song_artist.idSong AS idSong,
song_artist.idArtist AS idArtist,
artist.strArtist AS strArtist
FROM song_artist LEFT JOIN artist ON song_artist.idArtist = artist.idArtist

New query
4) SELECT song.*, songartistleftview.* FROM song
LEFT JOIN songartistleftview ON song.idSong = songartistleftview.idSong
WHERE song.idSong =1

4) has same slow query plan as 1) despite having all left joins like 2).

I genuinely believe that the optimiser is choosing the wrong plan for 1)
and 4). I do not see your argument for correctness. Could you please look
at this again.

Moreover earlier versions of SQLite e.g. 3.8.3.1 got it right and use same
efficient plan for 1), 2), 3) and 4). The current behaviour is making views
unusable in my application.

On 12 February 2016 at 20:47, Clemens Ladisch  wrote:

> Dave Blake wrote:
> >> It chooses a _correct_ plan.
> >
> > Really? With query 1) to select a song it first scans the song_artist
> table
> > that could contain many thousands of records. That does not seem right.
>
> The result is correct.  This would not necessarily be the case if it
> executed an outer join as if it were an inner join.
>
> A left join forces SQLite to scan the outer table first.  This is
> necessary for the join to work correctly.
>
> >> Your queries are quite different.
> >
> > They produce the same result sets, but yes one uses a view and the other
> > the constitent tables.
>
> No, the difference is that the view uses an inner join, while your
> second query use only outer joins.
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-12 Thread Dave Blake
Thank you for looking at this.

>It chooses a _correct_ plan.

Really? With query 1) to select a song it first scans the song_artist table
that could contain many thousands of records. That does not seem right.

>Your queries are quite different.

They produce the same result sets, but yes one uses a view and the other
the constitent tables.

>What is your query actually supposed to do?

I have simplified things to make the issue clear, obviously unsuccesfuly.
In my application the query is more complex.

In this example there is a many to many relationship between songs and
artists resolved using a song_artist link table. Both 1) and 2) return the
song fields and the artist fields for all the artists for a chosen song.
More realistically the song and artist tables would have many more columns,
and the where clause be more involved and result in more than one song. But
none of those things are relevent to the issue.

Importantly you say

>It has exactly the same query plan as the query with the view.

That is not what I am getting. I have listed the query plans I get with
3.8.10.1, what version are you using and what query plans do you get?

On 12 February 2016 at 13:34, Clemens Ladisch  wrote:

> Dave Blake wrote:
> > I noticed my queries going very slowly after changing a join to a left
> > join, examination of the query plan showed that the optimiser was
> choosing
> > a poor plan.
>
> It chooses a _correct_ plan.
>
> > It only occurs when the left join is on a views, if explicit
> > tables are used instead then the problem does not occur.
>
> Your queries are quite different.  (One less LEFT.)
>
> This is the equivalent query with tables:
>
> SELECT song.*, song_artist.*, artist.*
> FROM song
> LEFT JOIN (song_artist JOIN artist
>ON song_artist.idArtist = artist.idArtist
>   ) AS songartistview
>  ON song.idSong = songartistview.idSong
> WHERE song.idSong =1;
>
> It has exactly the same query plan as the query with the view.
>
> > In trying to get the artist(s) for a song ...
>
> A query to get the artist(s) for a song would look like this:
>
> SELECT *
> FROM artist
> WHERE idArtist IN (SELECT idArtist
>FROM song_artist
>WHERE idSong = 1);
>
> What is your query actually supposed to do?
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-12 Thread Dave Blake
I noticed my queries going very slowly after changing a join to a left
join, examination of the query plan showed that the optimiser was choosing
a poor plan. It only occurs when the left join is on a views, if explicit
tables are used instead then the problem does not occur.

To repeat the issue try this example.
CREATE TABLE song ( idSong integer primary key, strTitle varchar(512))
CREATE TABLE song_artist ( idArtist integer, idSong integer)
CREATE TABLE artist (idArtist integer primary key, strArtist
varchar(256), strBio  text)

CREATE INDEX idxSongArtist_1 ON song_artist ( idSong);
CREATE INDEX idxSongArtist_2 ON song_artist ( idArtist);

CREATE VIEW songartistview AS SELECT song_artist.idSong AS idSong,
song_artist.idArtist AS idArtist, artist.strArtist AS strArtist FROM
song_artist JOIN artist ON song_artist.idArtist = artist.idArtist

In the data there can be songs with no artist, and artists with no song
(hence the use of left join)

Compare the query plan of
1) SELECT song.*, songartistview.* FROM song LEFT JOIN songartistview ON
song.idSong = songartistview.idSong WHERE song.idSong =1

with
2) SELECT song.*, song_artist.*, artist.* FROM song LEFT JOIN song_artist
ON song.idSong = song_artist.idSong LEFT JOIN artist ON
song_artist.idArtist = artist.idArtist WHERE song.idSong =1


Query 2) sensibly does
"SEARCH TABLE song USING INTEGER PRIMARY KEY (rowid=?)"
"SEARCH TABLE song_artist USING INDEX idxSongArtist_1 (idSong=?)"
"SEARCH TABLE artist USING INTEGER PRIMARY KEY (rowid=?)"

But the view equivalent 1) does
"SCAN TABLE song_artist"
"SEARCH TABLE artist USING INTEGER PRIMARY KEY (rowid=?)"
"SEARCH TABLE song USING INTEGER PRIMARY KEY (rowid=?)"
"SCAN SUBQUERY 1"

In trying to get the artist(s) for a song, scanning the full song_artist
table is not an efficient place to start!
Note this is a greatly simplified example to show the issue, in real
application the impact of a suboptimal plan is significant.

My testing was done mostly in v3.8.10.2, but this also happens in v3.8.6,
yet does not seem to occur in v3.8.3.1

Running ANALYZE on my real world data made the issue even worse (plan
started with scan of artist table instead). My current work around is to
convert my views into tables, but it would be nice to be able to use views
as an alias for some table joins. I did consider using "CROSS JOIN" because
SQLite effectively treats that as a LEFT JOIN but does not left the
optimiser change the table order, however that makes my SQL non-generic and
my queries need to work on MySQL too.