> But, when I add another column on select, like, film_description, I get > the following error: > > "ERROR: column "film.description" must appear in the GROUP BY clause or > be used in an aggregate function" > > If I put that column on GROUP BY everything works ok. But I want > understant why do I need to do that. Can someone teach me, please?
The reason is pretty simple - GROUPing actually means "sorting into boxes by values in the columns after the GROUP BY keyword" (and then applying some aggregate functions to these boxes, as for example COUNT, AVG etc.) Besides these aggregates, you can select a column that 'constant' for each of the boxes, that is all the rows in that box have the same value in this column. That's the case of the first SQL query you've posted - you're grouping by 'film_id', thus all the rows in a box have the same value in this column. And thanks to this you can select the value in the SELECT. But in the second query, you'd like to select another column (directly, not through an aggregate function) - title. But there could be different values for each row in the box (PostgreSQL doesn't know that the ID uniquely identifies the title). For example imagine you would group by 'published_year' instead of the 'film_id' - in this case there would be many different movies in the same box, in which case it's impossible to select 'title' for all of them. There are two ways to solve this: 1) add the 'title' to the GROUP BY clause, thus all the rows in a box have the same value of 'title' (and postgresql knows about that) 2) use a subselect film_id, (SELECT title FROM film WHERE film_id = film_outer.film_id) AS title FROM film AS film_outer ... GROUP BY film_id; Tomas ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match