Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Petite Abeille

On Sep 5, 2013, at 9:45 PM, Yuzem  wrote:

> It is incredibly fast but it gives me the wrong result:

Yeah… you forgot the 'where' clause in the subquery… you need to restrict it on 
the genre 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Thanks Petite Abeille, I translated your code to this:
SELECT genres.genres,
   (
SELECT  substr(group_concat(name,' '),1,60)
FROM
(
  SELECTname
  FROM  movies

  JOIN  genres
  ONmovies.movies = genres.movies

  ORDER BY  movies.name

  LIMIT 6
)
) AS sample

FROM  genres
GROUP BY  genres
ORDER BY  genres.genres;
time: 0.028s

It is incredibly fast but it gives me the wrong result:
Action|American Graffiti American Graffiti American Graffiti Mulhol
Adventure|American Graffiti American Graffiti American Graffiti Mulhol
Animation|American Graffiti American Graffiti American Graffiti Mulhol
Biography|American Graffiti American Graffiti American Graffiti Mulhol
Etc...

Am I doing something wrong?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71035.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Petite Abeille

On Sep 5, 2013, at 11:27 PM, Yuzem  wrote:

> Any clue on why LEFT JOIN is so slow when used with genres but not with
> larger tables?

Sure. But your conclusion is a most likely a red herring. The crux of the 
matter is that inner and outer join have a wildly different semantic. And 
therefore execution plan. That's all. Seems all very reasonable from an 
optimizer point of view.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Petite Abeille-2 wrote
> Sure. But your conclusion is a most likely a red herring. The crux of the
> matter is that inner and outer join have a wildly different semantic. And
> therefore execution plan. That's all. Seems all very reasonable from an
> optimizer point of view.

But I have no conclusion, I acknowledge that inner and outer join are
different but I don't know why LEFT JOIN works very fast on larger tables
and very slow on smaller tables at least in those cases. I would like to
have a conclusion on that matter.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71042.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Petite Abeille

On Sep 5, 2013, at 10:28 PM, Yuzem  wrote:

> Ok, wonderful, now it is working correctly but how do I select multiple
> columns from table movies?
> Should I add another sub query?

Nope. You have now changed the problem definition, so scalars will not be a 
good fit. Blindly copy & paste them will not get you anywhere.

The key to success in your case is to access the movie_genre table only once, 
as selectively as possible. 

So, if, for a given genre you want 4 movies, you could try something along 
these lines:

selectgenre.code,
  movie.title
from  (
selectmovie_genre.movie_id,
  movie_genre.genre_id
from  movie_genre

where movie_genre.genre_id = 30

order by  movie_genre.id

limit 4
  )
asmovie_genre

join  movie
onmovie.id = movie_genre.movie_id

join  genre
ongenre.id = movie_genre.genre_id

Which gives you 4 rows:

Western|"26 Men" (1957)
Western|"A Man Called Shenandoah" (1965)
Western|"ABC Weekend Specials" (1977) {The Winged Colt (#1.5)}
Western|"Action in the Afternoon" (1953)





___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Thanks but 4 rows is not what I am looking for.
I found a solution, concatenation:
SELECT genres.genres,
   (
SELECT  group_concat(movie,' ')
FROM
(
  SELECTmovies.movies||','||name movie
  FROM  movies

  JOIN  genres AS movie_genres
  ONmovies.movies = movie_genres.movies

  WHERE movie_genres.genres = genres.genres

  ORDER BY  movies.name

  LIMIT 3
)
)

FROM  genres
GROUP BY  genres
ORDER BY  genres.genres;

Any clue on why LEFT JOIN is so slow when used with genres but not with
larger tables?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71040.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Ok, wonderful, now it is working correctly but how do I select multiple
columns from table movies?
Should I add another sub query?
Example:

SELECT genres.genres,
   (
SELECT  substr(group_concat(name,' '),1,60)
FROM
(
  SELECTname
  FROM  movies

  JOIN  genres AS movie_genres
  ONmovies.movies = movie_genres.movies

  WHERE movie_genres.genres = genres.genres

  ORDER BY  movies.name

  LIMIT 6
)
),
(
SELECT  substr(group_concat(movies,' '),1,60)
FROM
(
  SELECTname
  FROM  movies

  JOIN  genres AS movie_genres
  ONmovies.movies = movie_genres.movies

  WHERE movie_genres.genres = genres.genres

  ORDER BY  movies.name

  LIMIT 6
)
)

FROM  genres
GROUP BY  genres
ORDER BY  genres.genres;

If I add another sub query it takes almost twice the time than using only
one sub query. It is still faster than before.

 
Petite Abeille-2 wrote
> Yeah… the inner join vs out join will produce a significantly different
> access plan.

Yes but if that's the case don't you think that the difference in time is a
bit much?
And why there is almost no difference between JOIN and LEFT JOIN when using
tasks and people which are larger tables?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71038.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Petite Abeille

On Sep 5, 2013, at 8:56 PM, Yuzem  wrote:

> SELECT genres FROM genres LEFT JOIN movies ON genres.movies = movies.movies
> GROUP BY genres ORDER BY genres;
> time: 2.475s
> 
> SELECT genres FROM genres JOIN movies ON genres.movies = movies.movies GROUP
> BY genres ORDER BY genres;
> time: 0.035s

Yeah… the inner join vs out join will produce a significantly different access 
plan.

For example, using a different schema:

(1) inner join
0|0|1|SCAN TABLE movie_genre USING COVERING INDEX sqlite_autoindex_movie_genre_1
0|1|0|SEARCH TABLE genre USING INTEGER PRIMARY KEY (rowid=?)

(2) outer join
0|0|0|SCAN TABLE genre USING COVERING INDEX sqlite_autoindex_genre_1
0|1|1|SCAN TABLE movie_genre USING COVERING INDEX sqlite_autoindex_movie_genre_1

In this case, the first plan is much better in terms of throughput as 
movie_genre is much larger table.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
I did some testing and found some strange results.

SELECT genres FROM genres LEFT JOIN movies ON genres.movies = movies.movies
GROUP BY genres ORDER BY genres;
time: 2.475s

SELECT genres FROM genres JOIN movies ON genres.movies = movies.movies GROUP
BY genres ORDER BY genres;
time: 0.035s

SELECT tasks FROM tasks LEFT JOIN people ON tasks.people = people.people
GROUP BY tasks ORDER BY tasks"
time: 0.164s

SELECT tasks FROM tasks JOIN people ON tasks.people = people.people GROUP BY
tasks ORDER BY tasks"
time: 0.163s

The strange thing is that tasks and people are much larger tables than
genres and movies:
SELECT count() from genres;
3998
SELECT count() from movies;
1529
SELECT count() from tasks;
24964
SELECT count() from people;
19626



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71031.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
I did an ANALYZE but I'm getting the same results.
I do have indexes:
CREATE TABLE movies (
  movies UNIQUE,
  name,
  icon_modified
);

CREATE TABLE genres (
  genres,
  movies,
  UNIQUE(genres,movies)
);

people has an index on people (people UNIQUE) and tasks is a view:
CREATE VIEW tasks AS
SELECT 'creators'  tasks,movies,people,'' details FROM creators
UNION ALL
SELECT 'directors' tasks,movies,people,'' details FROM directors
UNION ALL
SELECT 'writers'   tasks,movies,people,'' details FROM writers
UNION ALL
SELECT 'actors'tasks,movies,people,characters details FROM actors;

The tables creators, directors, etc, have similar indexes as genres:
CREATE TABLE directors (
  directors,
  people,
  UNIQUE(directors,people)
);



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71033.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Simon Slavin

On 5 Sep 2013, at 7:56pm, Yuzem  wrote:

> I did some testing and found some strange results.

Please do an ANALYZE and try the same things again.

Also, do you have any indexes on those tables (apart from the primary keys, of 
course) ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Petite Abeille

On Sep 4, 2013, at 4:21 PM, Yuzem  wrote:

> I want to construct genres icons and each icon must display 4 movies.

Assuming this is IMDB… what about a scalar subquery?

For example, assuming a slightly different schema from yours:

selectgenre.code as genre,
  (
select  group_concat( title )
from
(
  selectmovie.title as title
  from  movie

  join  movie_genre
  onmovie_genre.movie_id = movie.id

  where movie_genre.genre_id = genre.id

  order by  movie.title

  limit 4
) 
  ) as sample
from  genre

where genre.code = 'Western'

order by  genre.code

> Western|"26 Men" (1957),"A Man Called Shenandoah" (1965),"ABC Weekend 
> Specials" (1977) {The Winged Colt (#1.5)},"Action in the Afternoon" (1953)

CPU Time: user 0.083246 sys 0.000443

This is for:

select count( * ) from genre;
30

select count( * ) from movie;
2545331

select count( * ) from movie_genre;
1545196


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-04 Thread Yuzem
Thanks Igor but now it is taking even more time: 3.139s
Any other idea?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71003.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-04 Thread Igor Tandetnik

On 9/4/2013 10:21 AM, Yuzem wrote:

SELECT genres name,count(genres.movies)
count,substr(group_concat(grouped.movies,' '),1,40)
src,substr(group_concat(grouped.icon_modified,' '),1,80) icon_modified FROM
genres LEFT JOIN (SELECT movies,icon_modified FROM movies WHERE
icon_modified != '') grouped on grouped.movies = genres.movies  GROUP BY
genres ORDER BY name"


Try

FROM genres LEFT JOIN movies grouped ON (genres.movies = movies.movies 
AND icon_modified != '')


The use of nested SELECT likely prevents SQLite from using the index on 
movies(movies).

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Please help me optimize this LEFT JOIN query.

2013-09-04 Thread Yuzem
I have 2 tables:
CREATE TABLE movies (
  movies UNIQUE,
  name,
  icon_modified
);

CREATE TABLE genres (
  genres,
  movies,
);

I want to construct genres icons and each icon must display 4 movies.
I need to get: genres from genres and movies,icon_modified from movies.
This is the query I have:
SELECT genres name,count(genres.movies)
count,substr(group_concat(grouped.movies,' '),1,40)
src,substr(group_concat(grouped.icon_modified,' '),1,80) icon_modified FROM
genres LEFT JOIN (SELECT movies,icon_modified FROM movies WHERE
icon_modified != '') grouped on grouped.movies = genres.movies  GROUP BY
genres ORDER BY name"

The query works correctly but it is very very slow, it takes 2.474s on my
machine, if I run the same query but using JOIN instead of LEFT JOIN it only
takes 0.039s

Is there any way of optimizing these query?
Thanks in advance.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users