Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Thanks Petite Abeille, I translated your code to this:
SELECT genres.genres,
   (
SELECT  substr(group_concat(name,' '),1,60)
FROM
(
  SELECTname
  FROM  movies

  JOIN  genres
  ONmovies.movies = genres.movies

  ORDER BY  movies.name

  LIMIT 6
)
) AS sample

FROM  genres
GROUP BY  genres
ORDER BY  genres.genres;
time: 0.028s

It is incredibly fast but it gives me the wrong result:
Action|American Graffiti American Graffiti American Graffiti Mulhol
Adventure|American Graffiti American Graffiti American Graffiti Mulhol
Animation|American Graffiti American Graffiti American Graffiti Mulhol
Biography|American Graffiti American Graffiti American Graffiti Mulhol
Etc...

Am I doing something wrong?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71035.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Petite Abeille-2 wrote
> Sure. But your conclusion is a most likely a red herring. The crux of the
> matter is that inner and outer join have a wildly different semantic. And
> therefore execution plan. That's all. Seems all very reasonable from an
> optimizer point of view.

But I have no conclusion, I acknowledge that inner and outer join are
different but I don't know why LEFT JOIN works very fast on larger tables
and very slow on smaller tables at least in those cases. I would like to
have a conclusion on that matter.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71042.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Thanks but 4 rows is not what I am looking for.
I found a solution, concatenation:
SELECT genres.genres,
   (
SELECT  group_concat(movie,' ')
FROM
(
  SELECTmovies.movies||','||name movie
  FROM  movies

  JOIN  genres AS movie_genres
  ONmovies.movies = movie_genres.movies

  WHERE movie_genres.genres = genres.genres

  ORDER BY  movies.name

  LIMIT 3
)
)

FROM  genres
GROUP BY  genres
ORDER BY  genres.genres;

Any clue on why LEFT JOIN is so slow when used with genres but not with
larger tables?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71040.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
Ok, wonderful, now it is working correctly but how do I select multiple
columns from table movies?
Should I add another sub query?
Example:

SELECT genres.genres,
   (
SELECT  substr(group_concat(name,' '),1,60)
FROM
(
  SELECTname
  FROM  movies

  JOIN  genres AS movie_genres
  ONmovies.movies = movie_genres.movies

  WHERE movie_genres.genres = genres.genres

  ORDER BY  movies.name

  LIMIT 6
)
),
(
SELECT  substr(group_concat(movies,' '),1,60)
FROM
(
  SELECTname
  FROM  movies

  JOIN  genres AS movie_genres
  ONmovies.movies = movie_genres.movies

  WHERE movie_genres.genres = genres.genres

  ORDER BY  movies.name

  LIMIT 6
)
)

FROM  genres
GROUP BY  genres
ORDER BY  genres.genres;

If I add another sub query it takes almost twice the time than using only
one sub query. It is still faster than before.

 
Petite Abeille-2 wrote
> Yeah… the inner join vs out join will produce a significantly different
> access plan.

Yes but if that's the case don't you think that the difference in time is a
bit much?
And why there is almost no difference between JOIN and LEFT JOIN when using
tasks and people which are larger tables?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71038.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
I did some testing and found some strange results.

SELECT genres FROM genres LEFT JOIN movies ON genres.movies = movies.movies
GROUP BY genres ORDER BY genres;
time: 2.475s

SELECT genres FROM genres JOIN movies ON genres.movies = movies.movies GROUP
BY genres ORDER BY genres;
time: 0.035s

SELECT tasks FROM tasks LEFT JOIN people ON tasks.people = people.people
GROUP BY tasks ORDER BY tasks"
time: 0.164s

SELECT tasks FROM tasks JOIN people ON tasks.people = people.people GROUP BY
tasks ORDER BY tasks"
time: 0.163s

The strange thing is that tasks and people are much larger tables than
genres and movies:
SELECT count() from genres;
3998
SELECT count() from movies;
1529
SELECT count() from tasks;
24964
SELECT count() from people;
19626



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71031.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-05 Thread Yuzem
I did an ANALYZE but I'm getting the same results.
I do have indexes:
CREATE TABLE movies (
  movies UNIQUE,
  name,
  icon_modified
);

CREATE TABLE genres (
  genres,
  movies,
  UNIQUE(genres,movies)
);

people has an index on people (people UNIQUE) and tasks is a view:
CREATE VIEW tasks AS
SELECT 'creators'  tasks,movies,people,'' details FROM creators
UNION ALL
SELECT 'directors' tasks,movies,people,'' details FROM directors
UNION ALL
SELECT 'writers'   tasks,movies,people,'' details FROM writers
UNION ALL
SELECT 'actors'tasks,movies,people,characters details FROM actors;

The tables creators, directors, etc, have similar indexes as genres:
CREATE TABLE directors (
  directors,
  people,
  UNIQUE(directors,people)
);



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71033.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me optimize this LEFT JOIN query.

2013-09-04 Thread Yuzem
Thanks Igor but now it is taking even more time: 3.139s
Any other idea?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000p71003.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Please help me optimize this LEFT JOIN query.

2013-09-04 Thread Yuzem
I have 2 tables:
CREATE TABLE movies (
  movies UNIQUE,
  name,
  icon_modified
);

CREATE TABLE genres (
  genres,
  movies,
);

I want to construct genres icons and each icon must display 4 movies.
I need to get: genres from genres and movies,icon_modified from movies.
This is the query I have:
SELECT genres name,count(genres.movies)
count,substr(group_concat(grouped.movies,' '),1,40)
src,substr(group_concat(grouped.icon_modified,' '),1,80) icon_modified FROM
genres LEFT JOIN (SELECT movies,icon_modified FROM movies WHERE
icon_modified != '') grouped on grouped.movies = genres.movies  GROUP BY
genres ORDER BY name"

The query works correctly but it is very very slow, it takes 2.474s on my
machine, if I run the same query but using JOIN instead of LEFT JOIN it only
takes 0.039s

Is there any way of optimizing these query?
Thanks in advance.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Please-help-me-optimize-this-LEFT-JOIN-query-tp71000.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How can I improve this query?

2013-03-07 Thread Yuzem
Thank you very much, that worked perfectly, I have one more question.
Given the same schema, I can list all genres with one movie by genre:
SELECT genres, movies FROM genres GROUP BY genres;
Action|[movie id]
Adventure|[movie id]
Animation|[movie id]
Biography|[movie id]
Comedy|[movie id]
Crime|[movie id]
Documentary|[movie id]
etc...

I want to get a specifed number of movies by genre, for example 3 movies by
genres:
Action|[movie id]
Action|[movie id]
Action|[movie id]
Adventure|[movie id]
Adventure|[movie id]
Adventure|[movie id]
Animation|[movie id]
Animation|[movie id]
Animation|[movie id]
etc...

Is that possible?
Thanks again!



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/How-can-I-improve-this-query-tp67470p67486.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How can I improve this query?

2013-03-06 Thread Yuzem
Hello, I have the following tables:
CREATE TABLE movies (movies,name);
CREATE TABLE genres (movies,genres);

Every movie has many genres and every genre has many movies.
I want to list all genres but those who match a specified movie must be
marked.

For example for movie "tt1637725":
SELECT distinct genres,movies from genres WHERE movies='tt1637725'
UNION ALL
SELECT distinct genres,'' FROM genres WHERE genres NOT IN (SELECT distinct
genres FROM genres WHERE movies='tt1637725') group by genres order by
genres;

Action|
Adult|
Adventure|
Animation|
Biography|
Comedy|tt1637725
Crime|
Documentary|
Drama|
Family|
Fantasy|tt1637725
Film-Noir|
History|
Horror|
Music|
Musical|
Mystery|
Romance|
Sci-Fi|
Short|
Sport|
Thriller|
War|
Western|

Is there any way to improve that code?
Thanks in advance!



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/How-can-I-improve-this-query-tp67470.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-17 Thread Yuzem


BareFeetWare-2 wrote:
> 
> Oh, I see, so the "complication" is that you would have to change the
> columns used in your bash script. That shouldn't be a big problem, but
> I'll leave the bash script to you.
> 
It isn't a big problem, the "complication" was to adapt all the tables and
inserts and column declarations only for testing.


BareFeetWare-2 wrote:
> 
> If for some reason rewriting the insert command in the bash script is
> insurmountable, you can just create a view in SQL to match the
> expectations of the bash script. That view can funnel each insert to the
> underlying SQL schema table columns using an "instead of insert" trigger.
> Let me know if you need more info on this.
> 
That's not necessary, I will adapt your code to my database schema.


BareFeetWare-2 wrote:
> 
> or you can change the delete trigger to remove statistics that drop to a
> zero count:
> 
> begin immediate
> ;
> drop trigger if exists "Movie People delete"
> ;
> create trigger "Movie People delete"
> on "Movie People"
> after delete
> begin
> insert or replace into "Capacity People Statistics" (Capacity_ID,
> People_ID, Count)
> select
>   old.Capacity_ID
> , old.People_ID
> , (   select coalesce(Count, 0) - 1 from "Capacity People Statistics"
>   where Capacity_ID = old.Capacity_ID and People_ID = 
> old.People_ID
>   )
> ;
> delete from "Capacity People Statistics"
> where Count = 0
> ;
> end
> ;
> commit
> ;
> 
Oh, I see, that makes a lot of sense, that was one of the problem I had,
this way I can delete directors without using a distinct, great!


BareFeetWare-2 wrote:
> 
> Great, I'm glad we finally got there :-)
> 
Thank you very much for all your help, I will try to adapt the code to my
schema, I will let you know if I meet any problem.
Thanks again!
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30954516.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-15 Thread Yuzem


BareFeetWare-2 wrote:
> 
> Does IMDB allow use of their data this way? After my brief reading of
> their site, I thought they charge a $15k minimum per year for data.
> 
IMDb Pro offers additional information and services, all the information I
am grabbing is already available to the public and I am not publishing this
information, there are a lot of programs that do this.


BareFeetWare-2 wrote:
> 
>> The script is written in bash and I can give you the code but I think it
>> would be very complicated to adapt it to your schema.
> 
> Complicated? Why? In what format is the grabbed data (ie what tables,
> columns, rows)? It usually just takes an insert select to move data from
> one table's schema to another.
> 
Well, you would have to replace the insert commands, lets say that I have
table movies with columns id,title,year,plot
In the script I have to set the columns as variables:
id=tt12345
title="movie title"
year=2011
plot="this is the plot"

Ant then I call the command insert:
insert movies

The command insert knows the columns for each table, it goes column by
column, if the column is declared as a variable it is inserted so you would
have to replace all those commands with something like:
echo "INSERT INTO movies" etc...

Let me know if you are interested in the script, the script is included in 
https://launchpad.net/figuritas/+download the application , but I can post
and explain the relevant functions if you want.


BareFeetWare-2 wrote:
> 
> Well, yes, but it depends of the definition of "how many directors there
> are". The above counts how many directors there are in total, counting the
> same person for each movie they direct.
> 
> What you want, I think, however, is how many people there are who are
> directors (counting each person just once, even if they direct multiple
> movies), given by this:
> 
> select count(*) from "Capacity People Statistics"
> where Capacity_ID = (select ID from Capacity where Name = 'director')
> ;
> 
Ok then, yes I wanted to count directors counting each person just once, I
think that that code will not count much faster.


BareFeetWare-2 wrote:
> 
> Hopefully now that I've given you the query you actually needed, it now
> makes sense ;-)
> 
Yes it does, but then I don't need the Count column since I want to speed up
counting distinct directors but counting each person just once.


BareFeetWare-2 wrote:
> 
> No, I don't think you need more tables to achieve the above.
> 
Yes you can do it with the above but I want to make it very fast, with those
two additional tables I think it will be much faster.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30935871.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-12 Thread Yuzem


BareFeetWare-2 wrote:
> 
> What is your source for the data? If I have that, I can test my schema
> properly, rather than sitting here entering a pile of dummy data. I looked
> at IMDB, but they only seem to have a paid license download.
> 
I am grabbing the data from the each movie imdb webpage. The script is
written in bash and I can give you the code but I think it would be very
complicated to adapt it to your schema.


BareFeetWare-2 wrote:
> 
> No. Sorry, I should have included that. You count directors like this:
> 
> select sum(Count) from "Capacity People Statistics"
> where Capacity_ID = (select ID from Capacity where Name = 'director')
> ;
> 
Are you sure that this count how many directors there are?
If I understand it correctly (probably I don't) you have for example table
"Movie People":
1|director|1
2|director|1
3|director|2

There are 2 directors and then in "Capacity People Statistics" you should
have:
director|1|2
director|2|1

If you use the previous code:
select sum(Count) from "Capacity People Statistics"
where Capacity_ID = (select ID from Capacity where Name = 'director')
;

I think it will  return 3 but there are only 2 directors.


BareFeetWare-2 wrote:
> 
>> Another thing: I don't understand the purpose of the Count column in
>> table
>> "Capacity People Statistics"
> 
> It hopefully now makes sense with my correction and example selects above.
> 
Sorry, but I still don't understands it because I don't understand that you
can count directors that way.


BareFeetWare-2 wrote:
> 
> I expect the above to be about the same speed or faster (since part of the
> counting is already done) than separate tables, but far more flexible (eg
> no need to add a table to accommodate a new capacity), and better
> normalized.
> 
But using different tables provides an instant result, you can try it with
any table:
SELECT count(*) from table;

In the other hand you are right, it is less flexible.
A more flexible and faster approach would be a table with total directors,
writers, etc... but that is very complicated because I would have to count
distinct directors, writers, etc... in every insert.

Oh wait... I think I got it. I need two more tables:
Tables:
"Movie People" (movie_id capacity people_id)
"Capacity People" (capacity_id people_id)
"Capacity Count" (capacity count)

On every insert into "Movie People" I trigger an insert into "Capacity
People" and on every insert/delete from "Capacity People" I add/substract 1
from the corresponding capacity in "Capacity Count"

Then I can count directors with:
SELECT count FROM "Capacity Count" WHERE capacity = 'directors';

What do you think? (I didn't test it)


BareFeetWare-2 wrote:
> 
>> Another thing: I don't understand the purpose of the Count column in
>> table
>> "Capacity People Statistics"
> 
> It hopefully now makes sense with my correction and example selects above.
> 
Sorry, but I still don't understands it, of-course, this is because I don't
understand that you can count directors that way.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30908962.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-11 Thread Yuzem


BareFeetWare-2 wrote:
> 
> Oh, I see. I was counting the total number of each unique capacity
> (including director, writers), but you want the number of each unique
> (capacity, person). No problem. We'll just add a People column to the
> statistics, and change the uniqueness constraint.
> 
> Replace my earlier "Capacity Statistics" and triggers with this:
> 

That should work, if I understand it correctly now I can count directors
with:
SELECT count(*) FROM "Capacity People Statistics" WHERE ID = 'directors';
Instead of:
SELECT count(distinct People_ID) FROM "Movie People" WHERE Capacity_ID =
'directors';

It is similar to using different tables with only the IDs.
Instead of using a table for directors another for writers, etc... this is
specified in a column, it is slower but it uses only one table.
If I use different tables the result is instant, I don't know if it will be
much faster to count from "Capacity People Statistics" than counting from
"Movie People".

Another thing: I don't understand the purpose of the Count column in table
"Capacity People Statistics"


Simon Slavin-3 wrote:
> 
> So if I deleted one record and created another you wouldn't spot it ?
> 
Yes, any change to the database will result in a complete refresh of the
cache.


Simon Slavin-3 wrote:
> 
> Your process is trustworthy only when you are certain that the database
> file is not currently open.  If there's a chance that some application may
> be modifying the file when you check these things then the results you get
> may not be up-to-date. 
> 
Yes but what can I do about it, the same happens if I open a text document
that it is already open in another text editor.
I could check if there is a journal file but I don't know if it is necessary
since I am not keeping any connection open, I try to make the connections as
short as possible and if I get some data that it isn't up to date I don't
see much problem.

Something that I just realized, comparing INTEGER columns doesn't seems
faster than comparing TEXT columns, this:
SELECT count(distinct ROWID) FROM keywords;
Result: 83513
Is slower than this:
SELECT count(distinct keywords) FROM keywords;
Result: 17321
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30900999.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-10 Thread Yuzem


Simon Slavin-3 wrote:
> 
> By looking at the file on disk ?  Are you taking into account the journal
> file ? 
> 
Yes, I do all the counts and save the data to a file and then if the file is
newer then the database I use the file else I count again.
No, I am not taking the journal file into account, I don't know what you
mean by that.


BareFeetWare-2 wrote:
> 
> Does tis answer your needs? If not, please explain further, but it will
> probably only require modifying a constraint in the schema I proposed,
> rather than denormalizing or partitioning. 
> 
I will try to explain it.
I have the table "movies people" with columns "movie_ID, capacity_ID,
people_ID"
I understand that all rows are unique but people_ID isn't unique.
For example lets say I have the following data:
1|director|1
2|director|2
3|director|1

In this example the total count for directors is 2, I have two distinct
directors.
In the table "Capacity Statistics" I will have:
director|2

The triggers you made add/subtract 1 from "Capacity Statistics" on
insert/delete on "movies people"
What happens if I add the following to "movies people"?
4|director|2

The trigger should add 1 to "Capacity Statistics":
director|3

But there are still 2 directors:
1|director|1
2|director|2
3|director|1
4|director|2


BareFeetWare-2 wrote:
> 
> Erm ... there are also movies which have more than one person directing. 
> You need to copy across the rowid from the MoviePeople table, and delete
> just based on that particular record. 
> 
Yes, of-course.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30898156.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-10 Thread Yuzem


BareFeetWare-2 wrote:
> 
> In that case, you should cache the counts in a separate table or two. That
> has a negligible overhead when you add a movie (which is infrequent), and
> basically no overhead when viewing (which is frequent).
I am doing that but in the application level, the down side is that I am
updating the cache on any change, I check the modified time of the database.


BareFeetWare-2 wrote:
> 
> Do that with triggers (which are a good thing in this context), to change
> the relevant count when an insert, delete or update occurs in the "Movie
> People" table. Something like:
> 
I think, not sure, that there will be a problem with that code.
I am not an expert so I had to stare that code some time to understand it.
If I am correct you are adding and subtracting 1 on every insert or delete.

The problem is that an insert may not be unique so before augmenting the
counter you have to check if it is unique and therefore you would have to
run a count(distinct col) on every insert which would be overkill.

That's the beauty of using a additional table. I make the column unique in
the the extra table and then on any insert in "movies people" I insert in
for example the directors table and any duplicate will be automatically
rejected.

The same problem exist on delete, the deleted entry may not be unique.

For example:
movie1|director|people1
movie2|director|people1

-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30897526.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-10 Thread Yuzem


Simon Slavin-3 wrote:
> 
> How much slower.  Did you make an index SQLite could use for that query ?
> 
Using distinct isn't slow, it is what I would expect but count(*) is
incredibly fast, it is instantaneous no matter how large is the table.
Yes, I tried with an index.


BareFeetWare-2 wrote:
> 
> Or you can try this:
> 
> select count(*) from People
> where ID in
> ( select People_ID from "Movie People" where Capacity_ID =
> (select ID from Capacity where Name = 'director')
> );
> 
> But I expect you'll get the same performance. 
> 
Yes, almost the same performance.


BareFeetWare-2 wrote:
> 
> You don't want to denormalize and have separate director and writer tables
> etc. That will get ugly. 
> 
That's what I thought.


Petite Abeille-2 wrote:
> 
> Right... looking at something like the casting information in IMdb (actor,
> director, writer, etc), there are about 25M movie + role + person
> combinations. Which, while not huge, starts to get taxing when queried on
> a lowly laptop.
> 
> For example, out of these 25M rows, about 1.2M represent directors (~217K)
> in movies (~1M). 
> 

Yes, in my application I have in the sidebar all those sections (movies,
years, tags, keywords, actors, directors, writers, etc...) and I count each
one, how many movies, how many years, etc...
It isn't very slow if I update only one item but the problem is when I have
to update the entire list, this uses a lot of cpu and takes some time and I
have to do it every time a movie is added.
http://3.bp.blogspot.com/_EbNPUgfUDXs/TIpif4U1III/A2A/YFSGOAdpOGA/s1600/nested-brosers.jpg
Here there is an old screenshot from the sidebar. 


Petite Abeille-2 wrote:
> 
> Assuming a movie_cast table [1] and a cast dimension [2], you could record
> the distinct count for each cast in the dimension once, and store it. That
> way you don't have to recompute it over and over.
> 
Yes, I am caching all counts in the application level but I have to update
it every time the database changes so I have to update it at least every
time I add a movie.


Petite Abeille-2 wrote:
> 
> Alternatively, you could indeed partition that bulky movie_cast table by
> its cast type, which will in effect reduce the search space to at most ~9M
> (movies by actors, ~1.3M movies, ~1.3M actors). 
> 
> That said, even partitioning will not help you much here as you still have
> to search through ~9M records to figure out how many distinct actors you
> have.
> 
Actually, I already have different tables for directors, writers, etc...


Petite Abeille-2 wrote:
> 
> So... perhaps best to cheat :)
> 
> One way to cheat is to precompute the answer by, for example, adding a
> is_* flag on your person table:
> 
> update person set is_actor = ( select count( distinct person_id ) from
> movie_actor where movie_actor.person_id = person.id );
> 
> The person table is much smaller (~3M) and easier to query in bulk:
> 
> explain query plan
> select count( * ) from person where is_actor = 1;
> 0|0|0|SEARCH TABLE person USING COVERING INDEX person_is_actor
> (is_actor=?) (~1596808 rows)
> 
> 1310119
> CPU Time: user 0.256286 sys 0.000257
> 
> There you go :)
> 
That's a good idea, I wonder if it isn't better to use a different table to
store only the ids for the directors, etc, as I said before.
I would end up with these tables:
movies 
people
movies_directors (movieId, peopleId)
movies_writers (movieId, peopleId)
directors (peopleId)
writers (peopleId)

Then I can count directors with:
SELECT count(*) FROM directors;


Petite Abeille-2 wrote:
> 
> This is where I wish SQLite could have bitmap indices:
> http://en.wikipedia.org/wiki/Bitmap_index
> 
That seems what I need.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30897078.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-10 Thread Yuzem


BareFeetWare-2 wrote:
> 
> Then you can count the directors like this:
> 
> select count(distinct People_ID) from "Movie People" join Capacity on
> "Movie People".Capacity_ID = Capacity.ID where Capacity.Name = 'director';
> 
> or:
> 
> select count(distinct People_ID) from "Movie People" where Capacity_ID =
> (select ID from Capacity where Name = 'director');
> 
> or you can create a view [...]
> 

Thanks you but what I want to do is to count without using count(distinct
col) because it is much slower than count(*).
In the previous examples about tags I can do it, I have "movies" and "tags"
related by "movies_tags"
and I can do:
SELECT count(*) FROM tags;
I want to know if there is any standard approach to do that.
The only thing I can think of is to have additional tables for the ids of
all directors, all writers, etc...
Tables "movies", "people", "capacity" and then tables "directors",
"writers", etc.. with only the IDs so I can count using count(*) which is
super fast.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30893712.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT : Speed up count(distinct col)

2011-02-10 Thread Yuzem

Hi, thanks a lot for the helpful replies.
Sorry to bother again, but there is still something that hasn't been
answered.

Simon Slavin-3 has addressed my question but not exactly what I was asking.
Suppose I have two tables "movies" and "people" and other tables to relate
both tables: "directors", "writers", etc...

Simon Slavin-3 told me to create one single table to relate "movies" and
"people":
MoviePeople: id, movie, person, capacity

But with that solution I can't perform the fast count(*) that I want.
I would have to do:
SELECT count(distinct person) FROM MoviePeople WHERE capacity = "director";

I want to know if there is any standard solution for normalizing this
database so I can do count(*) to count all directors or writers...

Thanks in advance for your help and patience.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30892423.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT : Speed up count(distinct col)

2011-02-07 Thread Yuzem

Ok, thanks all for your answers, I guest I will have to normalize the
database and explore the foreign key feature.

One more thing:
I have the tables "movies" and "people"
Those two tables are related by tables "directors", "writers", etc...
movies: id, title
people: id, name

directors: movieId, peopleId
writers: movieId, peopleId
etc...

How can I normalize that so I can count directors and writers using
count(*)?
Should I create another table for "directors" and another for "writers",
etc... with only the ids?
I will end up having:
movies > movies_directors < (directors people)

Thanks again!
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30868907.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed up count(distinct col)

2011-02-07 Thread Yuzem


Petite Abeille-2 wrote:
> 
> Hey... sounds like IMdb :P
> 
Yes, I'm coding a  http://yuzem.blogspot.com/p/figuritas-screenshots.html
movie manager  that grabs the info from imdb.


Petite Abeille-2 wrote:
> 
> In any case, as mentioned by Michael Black, you might benefit greatly by
> normalizing your table and indexing the foreign key: 
> 
> http://dev.alt.textdrive.com/browser/IMDB/IMDB.ddl#L401
> 

I thought sqlite didn't handle foreign keys correctly, I am using triggers
to automatically delete tags when a movie gets deleted.
If I use a foreign key will it automatically remove the tag if the movie
gets deleted?
Anyway, to use integers in the "tags" table is very complicated because I
will have to assign the corresponding number to each tag that I insert and I
have to insert lots of keywords for every movie.
Does sqlite has any function to convert a text string into an unique number?

-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30867411.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT : Speed up count(distinct col)

2011-02-07 Thread Yuzem



Black, Michael (IS) wrote:
> 
> Test#1
> create index tags_index on tags(tag);
> You should have an index for any fields you query on like this.
> 

Thanks Michael but I don't see any speed improvement:
create index test on tags(tag);
select count(distinct tag) from tags;

This is much faster:
select count(*) from tags;

Am I doing something wrong?
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30867275.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Speed up count(distinct col)

2011-02-07 Thread Yuzem

Hi.
I have this tables:
CREATE TABLE movies(movie_id INTEGER, title TEXT, unique(movie_id));
CREATE TABLE tags(movie_id INTEGER, tag TEXT, unique(movie_id,tag));

I have many movies by tag and many tables like "tags" (keywords, countries,
languages, genres, etc..)
I can count different movies very fast with:
SELECT count(*) FROM movies;

But the other tables are much slower depending on the size of the table:
SELECT count(distinct tag) FROM tags;

My solution is to create an additional table "count_tags" and then every
time a tag is added to table "tags" a trigger adds the tag to "count_tags",
I need also another trigger to remove the tag, then I can do:
SELECT count(*) FROM count_tags;

This solution implies one additional table and two triggers by each table.

I wanted to know if there is a simpler solution, maybe by using indexes.
Thanks in advance.
-- 
View this message in context: 
http://old.nabble.com/Speed-up-count%28distinct-col%29-tp30864622p30864622.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How do I get expression from column (smart folder orfilter implementation)

2010-11-26 Thread Yuzem

Thanks for the answer.
I have another problem:
I decided to use a view to do this, I will have to drop/create the view
every time filters is updated but I don't know how to use multiple selects.
I want to do something like this:

CREATE VIEW movies_filters AS SELECT 'rated',movie FROM movies WHERE rating
!= '' AND SELECT 'tagged',movie FROM movies WHERE movie in (SELECT movie
FROM tags);
-- 
View this message in context: 
http://old.nabble.com/How-do-I-get-expression-from-column-%28smart-folder-or-filter-implementation%29-tp30295959p30308922.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How do I get expression from column (smart folder or filter implementation)

2010-11-24 Thread Yuzem

Hello,
Suppose I have this tables:
movies: movie,name,rating
tags: movie,tag
filters: filter,expression

In filters I have:
rated,rating != ''
tagged,movie in (select movie from tags)

Now I want to create a view to relate movies and filters exactly like the
tags table:
movies_filters: movie,filter

Something like this:
SELECT 'rated',movie FROM movies WHERE rating != '';
SELECT 'tagged',movie FROM movies WHERE movie in (SELECT movie FROM tags);

But from the "filters" table (doesn't work):
SELECT filter,movie FROM filters LEFT JOIN movies WHERE movie IN (SELECT
movie FROM movie WHERE expression);
-- 
View this message in context: 
http://old.nabble.com/How-do-I-get-expression-from-column-%28smart-folder-or-filter-implementation%29-tp30295959p30295959.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rowid of the current insert

2009-12-14 Thread Yuzem


Tim Romano wrote:
> 
> You should keep your id and the imdbid in separate columns, because you 
> can then insert a title even if IMDB does not have it yet.
> 

Ok, I have thought about this and it will be probably simpler to have two
separated ids than using a trigger.
Now, there are some problems:

When I grab the movie info from imdb there is some data that goes in the
movie table that is grabbed after than for example the actors. the insert on
the table movies happens at the end.
I guess there is no option than changing the application to insert the movie
first but there is still another problem...

I have this tables:
CREATE TABLE movies(id integer,imb_id integer,title
text,unique(imdb_id),PRIMARY KEY (id))
CREATE TABLE directors(id integer,name_id integer,unique(id,name_id))
CREATE TABLE names(name_id integer,name text,bio
text,unique(imdb_name_id),PRIMARY KEY (name_id))

The table movies holds info about movies and the table names holds info
about persons, the table directors relate both tables.

The problem is that to insert something in the table directors I must use
last_insert_rowid() after inserting the movie to get the rowid of the movie
but I also need the rowid for the director.
If I insert in names first I can't get the movie id and vice-versa.

I need something like: last_insert_rowid(movies) and
last_insert_rowid(names)

Any idea on how to solve this problem?
-- 
View this message in context: 
http://old.nabble.com/rowid-of-the-current-insert-tp26725353p26779934.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] rowid of the current insert

2009-12-10 Thread Yuzem

CREATE TABLE movies(id integer,title text,unique(id))

Generally I have an unique id from imdb but some times the movie isn't
available.
I understand that there is a column called rowid that is the primary key.
I would like to insert the rowid on the id column when I don't have an id
for the movie.

Example:
insert into movies values(rowid,'title1');

How can I do that?


-- 
View this message in context: 
http://old.nabble.com/rowid-of-the-current-insert-tp26725353p26725353.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiples natural left joins problem

2009-12-06 Thread Yuzem

Is this a bug in new versions that will be fixed or it is the new behavior?

-- 
View this message in context: 
http://old.nabble.com/Multiples-natural-left-joins-problem-tp26627140p2730.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiples natural left joins problem

2009-12-04 Thread Yuzem




> You have not specified your version, but I can repeat your
> difficulties when using version 3.6.19 and 3.6.20
> 
I am using version 3.6.16



> You can get 'correct' results if you modify your query to
> SELECT movies.id, title, files, icon_modified, icon_width
> FROM ( movies natural left join files) as movies natural left join
> icons_movies
> LIMIT 25;
> 
This gives me exactly the same result as the other.
-- 
View this message in context: 
http://old.nabble.com/Multiples-natural-left-joins-problem-tp26627140p26645257.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiples natural left joins problem

2009-12-04 Thread Yuzem



> What query are you trying with the above tables, and how do the
> results differ from what you expect?
> 
This is the query:
SELECT movies.id,title,files,icon_modified,icon_width
FROM (movies natural left join files) natural left join icons_movies
LIMIT 25 ; 

The difference with your last query is that I have to specify the tables or
it says "ambiguous column name" and the problem with the result is that if a
movie doesn't have an entry in the table "files" it doesn't show the icon
properties.
-- 
View this message in context: 
http://old.nabble.com/Multiples-natural-left-joins-problem-tp26627140p26635883.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiples natural left joins problem

2009-12-03 Thread Yuzem



SimonDavies wrote:
> 
> sqlite> create table movies( id integer primary key, title text );
> sqlite> insert into movies( title ) values( 'movie1' );
> sqlite> insert into movies( title ) values( 'movie2' );
> sqlite> insert into movies( title ) values( 'movie3' );
> sqlite>
> sqlite> create table user( id integer primary key, rating text );
> sqlite> insert into user( rating ) values( 'rating1' );
> sqlite> insert into user( rating ) values( 'rating2' );
> sqlite>
> sqlite> create table tag( id integer primary key, tag text );
> sqlite> insert into tag( tag ) values( 'tag1' );
> sqlite> insert into tag( tag ) values( 'tag2' );
> sqlite> insert into tag( tag ) values( 'tag3' );
> sqlite>
> sqlite> SELECT a.id,a.title,a.rating,tag.tag FROM (movies natural left
> join user
> ) as a natural left join tag;
> 1|movie1|rating1|tag1
> 2|movie2|rating2|tag2
> 3|movie3||tag3
> sqlite>
> 
> Regards,
> Simon
> 

Mine is a little more complex actually:
CREATE TABLE movies(id INTEGER,year INTEGER,rating INTEGER,votes
INTEGER,runtime INTEGER,iconWidth INTEGER,iconHeight INTEGER,iconAlpha
INTEGER,iconModified INTEGER,title TEXT,type TEXT,plot TEXT,color
TEXT,unique(id))

CREATE TABLE files(id INTEGER,modified INTEGER,size INTEGER,files
TEXT,unique(files))

CREATE TABLE icons_movies(icon_width integer,icon_height integer,icon_alpha
integer,icon_modified integer,id text,unique(id))

Movies with no files will not display the icons. Don't know why. Maybe
because id is text in icons and integer in the others? But it joins, just,
not as expected...


SimonDavies wrote:
> 
> You need to read up on the different types of JOIN that SQL allows.  No
> reason to use a LEFT JOIN if you want a RIGHT JOIN, an INNER JOIN or a
> CROSS JOIN. 
> 
It says that right join is not supported and inner and cross join don't give
the expected result, they duplicate movies if there are more than one tag
for every movie.
-- 
View this message in context: 
http://old.nabble.com/Multiples-natural-left-joins-problem-tp26627140p26630530.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiples natural left joins problem

2009-12-03 Thread Yuzem

Thanks both for the replies.

Pavel Ivanov-2 wrote:
> 
> If using "natural left join" is not a requirement for you then this
> works as you expect it:
> select movies.id, title, rating, tag
>  from movies left join user on movies.id = user.id
> left join tag on movies.id = tag.id;
> 
Ok, this works and it is fast but isn't any way to make this a little
simpler because the real query involves more tables and more columns and it
becomes a mess. I tried with using(id) but it is the same problem than with
natural left join.


SimonDavies wrote:
> 
> SELECT a.id,a.title,a.rating,tag.tag FROM (movies natural left join
> user) as a natural left join tag;
> 
It doesn't work, it says: no such column a.id
If I use movies.id I have the same problem as before: movies with no rating
don't get any tags.

-- 
View this message in context: 
http://old.nabble.com/Multiples-natural-left-joins-problem-tp26627140p26629866.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Multiples natural left joins problem

2009-12-03 Thread Yuzem

Lets say I have the following tables:
movies: id,title
user: id,rating
tag:   id,tag

I want to left join all three tables:
SELECT movies.id,movies.title,user.rating,tag.tag
FROM movies natural left join user natural left join tag

The problem with this besides having to specify the tables (instead of
doing: select id,title,rating,tag) is that if a movie doesn't have any
rating it will not retrieve any tag neither. The table tag is joined to the
table user not to the main table movies

The following works as expected but it is terribly slow:
SELECT id,title,rating,tag
FROM movies natural left join (movies natural left join user) natural left
join (movies natural left join tag)

Is there any way to get the last result with an acceptable speed?
-- 
View this message in context: 
http://old.nabble.com/Multiples-natural-left-joins-problem-tp26627140p26627140.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Strange problem with different sqlite versions accessing the same database

2009-11-10 Thread Yuzem

Ok, I have replaced the older version and now I have the same sqlite version
on both computers but the problem persist. Changes don't get saved on the
computer that access the database trough the network.

Any idea?
-- 
View this message in context: 
http://old.nabble.com/Strange-problem-with-different-sqlite-versions-accessing-the-same-database-tp26257575p26294299.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Strange problem with different sqlite versions accessing the same database

2009-11-08 Thread Yuzem

I have one database on a shared folder across the network.
The database is accessed by two different computers with different sqlite
versions.
There was no problem until I updated the OS on one of the computer.
Now I have:
1: Ubuntu Hardy - sqlite3 3.4.2
2: Ubuntu Karmic - sqlite3 3.6.16

The newest version works correctly but the older doesn't save any changes.
If I do an update and then a select in the same connection I can see the
changes but they don't get saved.

Some times when accessing the database with one version the database was
different than when accessing with the other, and I know that it was the
same database, I even check the size. After some time, the database goes
back to be the same for the two versions.

I have no idea why this is happening...
Any help is welcome and sorry for my english.
-- 
View this message in context: 
http://old.nabble.com/Strange-problem-with-different-sqlite-versions-accessing-the-same-database-tp26257575p26257575.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How can I specify that a column is equal to another?

2009-07-01 Thread Yuzem

I have this:
select title,my_rating
from movies left join  user on movies.id = user.id
where id = 'tt0426459'
The result:
ambiguous column name: id

I could use:
select movies.id ids,title,my_rating
from movies left join  user on movies.id = user.id
where ids = 'tt0426459'

but I don't want to select the id

Another solution:
where movies.id = 'tt0426459'

Is there any way to specify that movies.id  is equal to user.id so I can use
just id in my query?
Thanks in advance!
-- 
View this message in context: 
http://www.nabble.com/How-can-I-specify-that-a-column-is-equal-to-another--tp24292794p24292794.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I need help with very complex queries

2009-03-10 Thread Yuzem


Jim Wilcoxson wrote:
> 
> For example, in your query you need to add something like:
> and genres.genre = 'drama'.
Yes but if I add that I neither get any result:

sqlite3 movies.db "select movies.id,title,year from
movies,genres,countries,languages,keywords,tags where movies.id = genres.id
and movies.id = countries.id and movies.id = languages.id and movies.id =
keywords.id and movies.id = tags.id and genres like 'horror'"

The same happens if I use genres.genres instead of only genres.


Jim Wilcoxson wrote:
> 
> For this query, only the movies and
> genres tables are needed because you aren't selecting any fields from
> the other tables, from what I can tell.
Yes I know but shouldn't I get all movies with that? I'm getting none...

-- 
View this message in context: 
http://www.nabble.com/I-need-help-with-very-complex-queries-tp22446301p22447156.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite question: group by column with multiple tags?

2009-02-24 Thread Yuzem



P Kishor-3 wrote:
> Google for "normalizing a database" and then read up on it. It will help. 
> 

Ok, thanks, thats the solution. Now there is another problem that arise.
Lets say I have a table called movies with 3 columns (id, titles, keywords):

1|title1|keyword1
1|title1|keyword2
1|title1|keyword3
2|title2|keyword1
2|title2|keyword2

The id column is unique, the same from imdb so if a want to add twice the
same movie it doesn't let me.

After normalization I have two tables (movies (id, titles) and keywords (id,
keywords)):
movies:
1|title1
2|title2
keywords:
1|keyword1
1|keyword2
1|keyword3
2|keyword1
2|keyword2

How do I prevent inserting the same keyword for the same movie?
-- 
View this message in context: 
http://www.nabble.com/Sqlite-question%3A-group-by-column-with-multiple-tags--tp22153722p22180987.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite question: group by column with multiple tags?

2009-02-23 Thread Yuzem

Thanks for the answer.

Igor Tandetnik wrote:
> This monstrosity gives the correct answer in your specific example, but 
> it relies on there being exactly two tags per folder.
It can be any number of tags per folder.

Igor Tandetnik wrote:
> 
> Consider normalizing your database. Split into two tables - folders and 
> tags - with a one-to-many relationship between them. Then the solution 
> would be trivial.
How can I do that considering that it can be any numbers of tags?


Igor Tandetnik wrote:
> 
> select
> (select count(*) from (select distinct folder from t1)),
> (select count(*) from (
> select replace(rtrim(tags,'abcdefghijklmnopqrstuvwxyz'),',','') 
> from t1
> union
> select replace(ltrim(tags,'abcdefghijklmnopqrstuvwxyz'),',','') 
> from t1));
> 
> Same caveat applies.
Ok, this answer my question at least for normal columns. Thanks.

How should I do to store tags in my database and been able to count them? Do
you know how other programs like firefox do this?
-- 
View this message in context: 
http://www.nabble.com/Sqlite-question%3A-group-by-column-with-multiple-tags--tp22153722p22159627.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sqlite question: group by column with multiple tags?

2009-02-22 Thread Yuzem

Suppose that have 2 columns: folders and tags:
~/Music|classic,rock
~/Music|classic,rock
~/Pictures|art,photos
~/Pictures|art,photos
~/Pictures|art,photos

To know the folder count I do:
sqlite3 test.db  "select folder, count(folders) from t1 group by folder"

Returns:
~/Music|2
~/Pictures|3

How can I do the same for tags using only sqlite to get this:
art|3
classic|2
photos|3
rock|2

Another question, is there any way to get max count for all grouped columns
in one single consult?
It should return something like this:
2|4

Meaning that the first column has 2 unique values (~/Music and ~/Pictures)
and the second column has 4 (art, classic. photos and rock)

Many thanks in advance! 
-- 
View this message in context: 
http://www.nabble.com/Sqlite-question%3A-group-by-column-with-multiple-tags--tp22153722p22153722.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users