Re: [SQL] Help with query involving aggregation and joining.

2003-02-23 Thread Josh Berkus
Eddie,

> My requirements involve several large tables. I have
> simplied scenario into the follow two fictional tables
> which describes the same requirements.

Keep in mind that a simplified example may not solve your real problem ...

> Basically I would like to display the latest
> submission for each course in a table as shown below,
> order by name of the courses.
>
> Query Results:
> ==
>  id | courseId |  name| submission
> ---
>  4  |  102 | Chemisty | 2002-02-22
>  3  |  104 | Maths| 2002-04-30
>  1  |  101 | Physics  | 2002-01-20

Easy:

SELECT id, courseid, name, max(submission) as submission
FROM history JOIN courses ON history.courseid = course.id
GROUP BY id, courseid, name
ORDER BY name

And as such, I suspect that your real case is more complicated than the above 
...

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly


Re: [SQL] [BUGS] 7.3 GROUP BY differs from 7.2

2003-02-23 Thread Josh Berkus
Guys,

SQL spec aside, thinking about this from a strictly implementation/user point 
of view:
(an keeping in mind that I think it's very important that we work out the 
spec-correct behaviour for 7.4 and/or 7.3.3)

The particular case that Dan has raised is an issue for four reasons:
1) It looks to a human like it *should* work, and I think given a long weekend 
of relational calculus someone (not me) could define the cases where it is OK 
as opposed to the cases (probably the majority) where it is not.
2) That syntax *did* work in previous versions of PostgreSQL.
3) That syntax will be accepted by some other SQL databases.
4) The error message is rather confusing and could cause a developer to spend 
an hour or more hunting for the wrong error.

I propose that, should we decide not to change the behaviour of the parser, 
that we do the following:

1) add the following to the FAQ or elsewhere:

Q.  I just got the message "ERROR:  Attribute unnamed_join.[column name] must 
be GROUPed or used in an aggregate function" and my GROUP BY query won't run, 
even though all of the columns are in the GROUP BY clause. This query may 
have worked in PostgreSQL 7.2, or on another SQL database.  What do I do?

A. You are probably qualifying a column name differently in the SELECT clause 
than in the GROUP BY clause, causing the parser to be confused about what you 
really mean.   For example, you may be referring to a column by its simple 
column name ("element_id") in the SELECT clause, and by its table-qualified 
name ("table1.element_id") in the GROUP BY clause, or you may be using an 
alias in one place but not the other.   Please make sure that all columns in 
the SELECT clause match *exactly* the columns in the GROUP BY clause.

2) That we add a warning in the 7.3 release notes about the breaking of 
backward compatibility.

Thoughts?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] syntax question

2003-02-23 Thread Josh Berkus
James,

> but thats what:
> rec record
> select into rec id from table;
> return rec.id
>
> does
>
> my question was can i do this with a query built inside a string?

No.  That's what I was talking about.  You have to use the loop.


-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly


Re: [SQL] Help with query involving aggregation and joining.

2003-02-23 Thread Eddie Cheung

Hi,

I was very glad to see the replies from you guys this
morning.  The two suggested SQL queries did not return
the expected results, but will help me to explain the
problem I am facing further.

1) Josh suggested the following query. (I have made
minor changes by adding the table name to each field)

   SELECT history.id, history.courseId, course.name,
MAX(history.submission) AS submission
   FROM history JOIN course ON history.courseId =
course.Id
   GROUP BY history.id, history.courseId, course.name
   ORDER BY course.name;

The results returned are:
 id | courseid |   name| submission
+--+---+
  2 |  102 | Chemistry | 2002-02-17
  4 |  102 | Chemistry | 2002-02-22
  3 |  104 | Maths | 2002-04-30
  5 |  104 | Maths | 2002-03-15
  6 |  104 | Maths | 2002-01-21
  1 |  101 | Physics   | 2002-01-20

There are duplicate courses because the history.id
column has different values. The history.id cannot be
use in the GROUP BY clause. But it is one of the
displaying field, so I could not remove it from the
GROUP BY clause either.

2) Bruno suggested the following query:

  select distinct on (course.courseid)
  history.id, course.courseid, course.name,
history.submission
  from course natural join history
  order by course.courseid, history.submission desc;
  
I have not used NATURAL JOIN before, but from what I
know, it joins the columns with the same name. Since
the joining columns of History and Course have
different names, I have replace JOIN clause. Please
let me know if I have made a mistake.

The modified query is:
   SELECT DISTINCT ON (course.id) course.id,
history.id, course.name, history.submission 
   FROM history JOIN course ON history.courseId =
course.id 
   ORDER BY course.id, history.submission desc;

The results returned are :
 id  | id |   name| submission
-++---+
 101 |  1 | Physics   | 2002-01-20
 102 |  4 | Chemistry | 2002-02-22
 104 |  3 | Maths | 2002-04-30
 
The problem here is that the results are not ordered
by the submission date. If I sort by
"history.submission" first, I get
   ERROR: SELECT DISTINCT ON expressions must match
initial ORDER BY expressions.
Please note that I cannot select distinct on the
course.name either because it is NOT unique. The
original tables are much larger, and the only unique
column is the id.

I have included the queries to create the tables here.



CREATE TABLE course (
id integer,
name varchar(32),
   Constraint course_pkey Primary Key (id)
);

CREATE TABLE history (
id integer NOT NULL,
courseid integer REFERENCES course(id),
submission date,
Constraint history_pkey Primary Key (id)
);

INSERT INTO course (id,name) VALUES (101,'Physics');
INSERT INTO course (id,name) VALUES (102,'Chemistry');
INSERT INTO course (id,name) VALUES (103,'Biology');
INSERT INTO course (id,name) VALUES (104,'Maths');
INSERT INTO course (id,name) VALUES (105,'English');

INSERT INTO history (id,courseid,submission) VALUES
(1,101,'2002-01-20');
INSERT INTO history (id,courseid,submission) VALUES
(2,102,'2002-02-17');
INSERT INTO history (id,courseid,submission) VALUES
(3,104,'2002-04-30');
INSERT INTO history (id,courseid,submission) VALUES
(4,102,'2002-02-22');
INSERT INTO history (id,courseid,submission) VALUES
(5,104,'2002-03-15');
INSERT INTO history (id,courseid,submission) VALUES
(6,104,'2002-01-21');




Thanks for all your help.


Regards,
Eddie



--- Bruno Wolff III <[EMAIL PROTECTED]> wrote:
> On Sun, Feb 23, 2003 at 11:02:27 -0800,
>   Eddie Cheung <[EMAIL PROTECTED]> wrote:
> > 
> > HISTORY
> > ===
> > id | courseId  | submission
> > ---+---+-
> >  1 |  101  | 2002-01-20 
> >  2 |  102  | 2002-02-17
> >  3 |  104  | 2002-04-30
> >  4 |  102  | 2002-02-22
> >  5 |  104  | 2002-03-15
> >  6 |  104  | 2002-01-21
> >  
> >  
> > COURSE
> > == 
> >   id  | name
> > --+---
> >   101 | Physics
> >   102 | Chemistry
> >   103 | Biology
> >   104 | Maths
> >   105 | English
> > 
> > 
> > Basically I would like to display the latest
> > submission for each course in a table as shown
> below,
> > order by name of the courses.
> > 
> > Query Results:
> > ==
> >  id | courseId |  name| submission
> > ---
> >  4  |  102 | Chemisty | 2002-02-22
> >  3  |  104 | Maths| 2002-04-30
> >  1  |  101 | Physics  | 2002-01-20
> 
> I think you want to do something like:
> 
> select distinct on (course.courseid)
>   history.id, course.courseid, course.name,
> history.submission
>   from course natural join history
>   order by course.courseid, history.submission desc;
> 
> ---(end of
> broadcast)---
> TIP 1: subscribe and unsubscribe commands go to
[EMAIL PROTECTED]


___