Following up my own question again, I've realised my error and solved my
problem - in the interests of completing this thread, the working query
(two versions) can be found below...

On Sun, 2006-04-02 at 18:00 +1000, Ross Johnson wrote:
> On Sun, 2006-04-02 at 16:43 +1000, Ross Johnson wrote:
> > Hi,
> > 
> > I'm relatively stretched when it comes to SQL but I'm getting there. I
> > have a question though:
> > 
> > Thanks to various web sites I've succeeded in creating simple cross
> > tables in PostgreSQL. I'm now trying more complicated examples
> > (migrating queries from MS Access to PostgreSQL). I'm stuck on getting
> > grouping to work where the selection element isn't a real table field,
> > where it is generated by a CASE statement. Here's a full actual SELECT
> > statement, that works but isn't quite what I need, followed by the first
> > few rows of output:
> > 
> > SELECT DISTINCT
> >     CASE
> >             WHEN lower(c."Order") = 'coleoptera' THEN 5
> >             WHEN lower(c."Order") = 'trichoptera' THEN 8
> >             WHEN lower(c."Order") = 'ephemeroptera' THEN 6
> >             WHEN lower(c."Order") = 'plecoptera' THEN 7
> >             WHEN lower(c."Class") = 'oligochaeta' THEN 1
> >             WHEN lower(c."Family") LIKE 'chiron%' THEN 2
> >             WHEN lower(c."Order") = 'diptera' AND lower(c."Family") NOT 
> > LIKE 'chiron%' THEN 3
> >             ELSE 4
> >     END AS "Ranking",
> >     CASE
> >             WHEN lower(c."Order") = 'coleoptera' THEN 'Coleoptera'
> >             WHEN lower(c."Order") = 'trichoptera' THEN 'Trichoptera'
> >             WHEN lower(c."Order") = 'ephemeroptera' THEN 'Ephemeroptera'
> >             WHEN lower(c."Order") = 'plecoptera' THEN 'Plecoptera'
> >             WHEN lower(c."Class") = 'oligochaeta' THEN 'Oligochaeta'
> >             WHEN lower(c."Family") LIKE 'chiron%' THEN 'Chironomidae'
> >             WHEN lower(c."Order") = 'diptera' AND lower(c."Family") NOT 
> > LIKE 'chiron%' THEN 'Diptera (Other)'
> >             ELSE 'Other'
> >     END AS "Taxa",
> >     SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS 
> > "2222011",
> >     SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count" END) AS 
> > "2222012",
> >     SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS 
> > "2222013",
> >     SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS 
> > "2222014"
> > FROM "tblBugIDSheetInfo" b
> >     INNER JOIN "tblBugCount" c USING ("BugSheetID")
> > GROUP BY c."Order", c."Class", c."Family"
> > ORDER BY "Ranking"
> > 
> > Generates the following output:
> > 
> > Ranking     Taxa                    2222011 2222012 2222013 2222014
> > ---------------------------------------------------------------
> > 1   "Oligochaeta"           487     1711    1759    1078
> > 1   "Oligochaeta"           7                       
> > 1   "Oligochaeta"                                   
> > 2   "Chironomidae"          1385    2335    1500    1513
> > 2   "Chironomidae"                                  
> > 3   "Diptera (Other)"                               5
> > 3   "Diptera (Other)"               1       1       3
> > 3   "Diptera (Other)"       199     19      40      37
> > 3   "Diptera (Other)"                               
> > ...
> 
> I should add that I've just tried the following query, which is just a
> rearrangement of the above query using a sub-SELECT, and the result is
> the same as above apart from a slightly different ordering of the rows.
> That is, GROUP BY "Ranking" still doesn't appear to do anything. I was
> under the impression that a sub-SELECT creates a temporary, or at least
> a pseudo-temporary table (a "virtual" table as it's called in the
> documentation), so I would have expected this query to work even if the
> above one doesn't.
> 
> 
> SELECT DISTINCT
>       "Ranking", "Taxa",
>       SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS 
> "2222011",
>       SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count" END) AS 
> "2222012",
>       SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS 
> "2222013",
>       SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS 
> "2222014"
> FROM "tblBugIDSheetInfo" b
>       INNER JOIN (
>               SELECT "BugSheetID",
>               CASE
>                       WHEN lower("Order") = 'coleoptera' THEN 5
>                       WHEN lower("Order") = 'trichoptera' THEN 8
>                       WHEN lower("Order") = 'ephemeroptera' THEN 6
>                       WHEN lower("Order") = 'plecoptera' THEN 7
>                       WHEN lower("Class") = 'oligochaeta' THEN 1
>                       WHEN lower("Family") LIKE 'chiron%' THEN 2
>                       WHEN lower("Order") = 'diptera' AND lower("Family") NOT 
> LIKE 'chiron%' THEN 3
>                       ELSE 4
>               END AS "Ranking",
>               CASE
>                       WHEN lower("Order") = 'coleoptera' THEN 'Coleoptera'
>                       WHEN lower("Order") = 'trichoptera' THEN 'Trichoptera'
>                       WHEN lower("Order") = 'ephemeroptera' THEN 
> 'Ephemeroptera'
>                       WHEN lower("Order") = 'plecoptera' THEN 'Plecoptera'
>                       WHEN lower("Class") = 'oligochaeta' THEN 'Oligochaeta'
>                       WHEN lower("Family") LIKE 'chiron%' THEN 'Chironomidae'
>                       WHEN lower("Order") = 'diptera' AND lower("Family") NOT 
> LIKE 'chiron%' THEN 'Diptera (Other)'
>                       ELSE 'Other'
>               END AS "Taxa",
>               "Order", "Class", "Family", "Count"
>               FROM "tblBugCount") c USING ("BugSheetID")
> GROUP BY c."Ranking", c."Taxa", c."Order", c."Class", c."Family"
> ORDER BY "Ranking"
> 

The problem was I was trying to do too much in one GROUP BY clause.
After getting the right result using a temporary table I backtracked and
found that the following version gives me the result I'm looking for.

SELECT DISTINCT
        "Ranking", "Taxa",
        SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS 
"2222011",
        SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count" END) AS 
"2222012",
        SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS 
"2222013",
        SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS 
"2222014"
FROM "tblBugIDSheetInfo" b
        INNER JOIN (
                SELECT "BugSheetID",
                CASE
                        WHEN lower("Order") = 'coleoptera' THEN 5
                        WHEN lower("Order") = 'trichoptera' THEN 8
                        WHEN lower("Order") = 'ephemeroptera' THEN 6
                        WHEN lower("Order") = 'plecoptera' THEN 7
                        WHEN lower("Class") = 'oligochaeta' THEN 1
                        WHEN lower("Family") LIKE 'chiron%' THEN 2
                        WHEN lower("Order") = 'diptera' AND lower("Family") NOT 
LIKE 'chiron%' THEN 3
                        ELSE 4
                END AS "Ranking",
                CASE
                        WHEN lower("Order") = 'coleoptera' THEN 'Coleoptera'
                        WHEN lower("Order") = 'trichoptera' THEN 'Trichoptera'
                        WHEN lower("Order") = 'ephemeroptera' THEN 
'Ephemeroptera'
                        WHEN lower("Order") = 'plecoptera' THEN 'Plecoptera'
                        WHEN lower("Class") = 'oligochaeta' THEN 'Oligochaeta'
                        WHEN lower("Family") LIKE 'chiron%' THEN 'Chironomidae'
                        WHEN lower("Order") = 'diptera' AND lower("Family") NOT 
LIKE 'chiron%' THEN 'Diptera (Other)'
                        ELSE 'Other'
                END AS "Taxa",
                "Order", "Class", "Family", SUM("Count") AS "Count"
                FROM "tblBugCount"
                GROUP BY "BugSheetID", "Order", "Class", "Family") c USING 
("BugSheetID")
GROUP BY "Ranking", "Taxa"
ORDER BY "Ranking"

And modifying the original query gives the faster performance:

SELECT  "Ranking",
        "Taxa",
        SUM("2222011") AS "2222011",
        SUM("2222012") AS "2222012",
        SUM("2222013") AS "2222013",
        SUM("2222014") AS "2222014"
FROM (
        SELECT DISTINCT
        CASE
                WHEN lower(c."Order") = 'coleoptera' THEN 5
                WHEN lower(c."Order") = 'trichoptera' THEN 8
                WHEN lower(c."Order") = 'ephemeroptera' THEN 6
                WHEN lower(c."Order") = 'plecoptera' THEN 7
                WHEN lower(c."Class") = 'oligochaeta' THEN 1
                WHEN lower(c."Family") LIKE 'chiron%' THEN 2
                WHEN lower(c."Order") = 'diptera' AND lower(c."Family") NOT 
LIKE 'chiron%' THEN 3
                ELSE 4
        END AS "Ranking",
        CASE
                WHEN lower(c."Order") = 'coleoptera' THEN 'Coleoptera'
                WHEN lower(c."Order") = 'trichoptera' THEN 'Trichoptera'
                WHEN lower(c."Order") = 'ephemeroptera' THEN 'Ephemeroptera'
                WHEN lower(c."Order") = 'plecoptera' THEN 'Plecoptera'
                WHEN lower(c."Class") = 'oligochaeta' THEN 'Oligochaeta'
                WHEN lower(c."Family") LIKE 'chiron%' THEN 'Chironomidae'
                WHEN lower(c."Order") = 'diptera' AND lower(c."Family") NOT 
LIKE 'chiron%' THEN 'Diptera (Other)'
                ELSE 'Other'
        END AS "Taxa",
        SUM(CASE WHEN b."LocationCode" = '2222011' THEN c."Count" END) AS 
"2222011",
        SUM(CASE WHEN b."LocationCode" = '2222012' THEN c."Count" END) AS 
"2222012",
        SUM(CASE WHEN b."LocationCode" = '2222013' THEN c."Count" END) AS 
"2222013",
        SUM(CASE WHEN b."LocationCode" = '2222014' THEN c."Count" END) AS 
"2222014"
        FROM "tblBugIDSheetInfo" b
                INNER JOIN "tblBugCount" c USING ("BugSheetID")
        WHERE b."LocationCode" = '2222011'
           OR b."LocationCode" = '2222012'
           OR b."LocationCode" = '2222013'
           OR b."LocationCode" = '2222014'
        GROUP BY c."Order", c."Class", c."Family") d
GROUP BY "Ranking", "Taxa"
ORDER BY "Ranking"


Both these queries give me the result I was after:

> > I want to GROUP on the "Ranking" field as well so that all rows with the
> > same "Ranking" value are SUMmed. That is, I'm trying to achieve the
> > following:
> > 
> > Ranking     Taxa                    2222011 2222012 2222013 2222014
> > ---------------------------------------------------------------
> > 1   "Oligochaeta"           494     1711    1759    1078
> > 2   "Chironomidae"          1385    2335    1500    1513
> > 3   "Diptera (Other)"       199     20      41      45
> > ...

Applying the grouping at the appropriate points helps :)

Regards.
Ross Johnson



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Reply via email to