Hi,

I have created a standalone test case to demonstrate my issue.

There are 2 tablescans in the query, there should only be one.

(The row counts are small so arguably the optimisation is ok here, but it
does it when the row counts are a lot higher)

regards,

Mike


DROP ALL OBJECTS;

CREATE TABLE A (
    id BIGINT PRIMARY KEY,
    name VARCHAR
);

CREATE TABLE B (
    id BIGINT PRIMARY KEY,
    name VARCHAR,
    a BIGINT
);

CREATE TABLE C (
    id BIGINT PRIMARY KEY,
    name VARCHAR
);

ALTER TABLE C ADD CONSTRAINT C2B FOREIGN KEY(id) REFERENCES B(id);
ALTER TABLE B ADD CONSTRAINT B2A FOREIGN KEY(a) REFERENCES A(id);

INSERT INTO A(id,name) VALUES(0,'a0');
INSERT INTO A(id,name) VALUES(1,'a1');
INSERT INTO A(id,name) VALUES(2,'a2');
INSERT INTO A(id,name) VALUES(3,'a4');

INSERT INTO B(id,name,a) VALUES(0,'b0','0');
INSERT INTO B(id,name,a) VALUES(1,'b1','0');
INSERT INTO B(id,name,a) VALUES(2,'b2','0');
INSERT INTO B(id,name,a) VALUES(3,'b3','0');
INSERT INTO B(id,name,a) VALUES(4,'b4','0');
INSERT INTO B(id,name,a) VALUES(5,'b5','1');
INSERT INTO B(id,name,a) VALUES(6,'b6','1');
INSERT INTO B(id,name,a) VALUES(7,'b7','1');
INSERT INTO B(id,name,a) VALUES(8,'b8','2');
INSERT INTO B(id,name,a) VALUES(9,'b9','2');


INSERT INTO C(id,name) VALUES(0,'c0');



ANALYZE;


EXPLAIN ANALYZE SELECT
   a.name as ANAME,
   aggr.bs,
   aggr.cs
FROM a LEFT JOIN (
     SELECT
         group_concat(b.name) as bs,
         group_concat(c.name) as cs,
         b.a as groupby
     FROM b INNER JOIN c ON b.id=c.id
     GROUP BY groupby
) as aggr ON a.id=aggr.groupby;














On Mon, Mar 9, 2015 at 1:24 PM, Mike Goodwin <[email protected]> wrote:

> Hi,
>
> I have a query performance problem that has come, gone away and come back
> again.
>
> The issue seems to be when using what are sometimes referred to as sub
> types. A sub type is a table that share a primary key with the super type
> table. This explains the scheme a bit better.
>
>
> http://blogs.msdn.com/b/dfurman/archive/2009/08/31/disjoint-subtyping-in-sql.aspx
>
> I have a situation where I have a query that essentially involves a sub
> type (B/enquiry line) and the super type (A/part) table. A 3rd table
> (C/enquiry) is referenced by the sub type directly and there is a foreign
> key/index between them. I want to aggregate over B/A and group by C.The
> problem seems to be that the optimiser has chosen to use A as the main
> table in the group by sub query and then because of this choice it does not
> have access to the query when joining to C and so it becomes very slow
> because of the nested table scan.
>
> I appreciate that probably reads like nonsense and may not be accurate. I
> can provide the SCRIPT TO schema/test data, if anyone would like to take a
> closer look. The query is and the EXPLAIN is at the bottom of this email.
>
>
> I should say I really appreciate h2 and the performance it gives us (it's
> a great little database).
>
> <minor SQL rant> Sometimes I find SQL to be a bit frustrating, because
> when doing what I view as hierarchical operations with what should be a
> clear execution plan, when they are expressed in SQL this clarity on how
> they should be executed is lost because of its generality. 99% of the SQL I
> write (or generate) usually amounts to hierarchical operations and this
> generality increases the risk of performance issues - I think this maybe
> true for many database users. Take In this example, some enquiries, some
> lines, some parts which correspond to lines. There may be 1000s of
> enquiries and 10000s of lines/parts. However each enquiry would only ever
> have a few lines (probably <10 on average). The lines themselves are never
> shared between enquires. An operation aggregating over the lines (e.g.
> counting them, or summing their values) should not ever need to bigger than
> O(lines) and I should be able to express it in away which this is clear and
> without risk. It seems to be me (having used other database, e.g. oracle)
> that the promise of optimisers removing this risk have never fully
> delivered. I would dearly like to have a way to express hiercarchical
> queries and  bypass sql!! (I'd settle for a way to optionally prevent the
> query optimiser from reordering joins).
>
> regards,
>
> Mike
>
> (on a related note, I think that it would be really good if the SCRIPT
> function could take a query and then output just the necessary statements
> to recreate enough of the database to run the query, it would make
> producing isolated test cases many many times easier ...)
>
> ----------------
>
>
>
> EXPLAIN SELECT
>         IFNULL(t_lines_filter0.vc0,0) AS value
>     FROM "enquiry" AS t
>     LEFT JOIN (
>         SELECT
>             IFNULL(coalesce(sum(CASE WHEN
>                 t_lines_filter0."include"
>             THEN
>                 CASE WHEN t_lines_filter0_part_sub_job_costing.revision IS
> NULL
>                 THEN NULL
>                 ELSE
> IFNULL(t_lines_filter0_part_sub_job_costing_internal.vc0,0)
>                 END
>             ELSE
>                 0
>             END),0),0) AS vc0,
>             t_lines_filter0."enquiry" AS GROUPBY
>         FROM "enquiry_line" AS t_lines_filter0
>         INNER JOIN "part_sub" AS t_lines_filter0_part_sub ON
> t_lines_filter0."Id" IS t_lines_filter0_part_sub."Id"
>         INNER JOIN "job" AS t_lines_filter0_part_sub_job ON
> t_lines_filter0_part_sub."Id" IS t_lines_filter0_part_sub_job."Id"
>         LEFT JOIN "costing" AS t_lines_filter0_part_sub_job_costing ON
> t_lines_filter0_part_sub_job."costing" IS
> t_lines_filter0_part_sub_job_costing."Id"
>         LEFT JOIN (
>             SELECT
>
> IFNULL(coalesce(sum(t_lines_filter0_part_sub_job_costing_internal."total_time"),0),0)
> AS vc0,
>                 t_lines_filter0_part_sub_job_costing_internal."costing" AS
> GROUPBY
>             FROM "costing_internal" AS
> t_lines_filter0_part_sub_job_costing_internal
>             WHERE (true)
>             GROUP BY groupby
>         ) AS t_lines_filter0_part_sub_job_costing_internal ON
> t_lines_filter0_part_sub_job_costing."Id" IS
> t_lines_filter0_part_sub_job_costing_internal.groupby
>         WHERE (t_lines_filter0."include")
>         GROUP BY groupby
>     ) AS t_lines_filter0 ON t."Id" IS t_lines_filter0.groupby ;
>
>
> ----------
>
>
> SELECT
>     IFNULL(T_LINES_FILTER0.VC0, 0) AS VALUE
> FROM "enquiry" T
>     /* "enquiry".tableScan */
> LEFT OUTER JOIN (
>     SELECT
>         IFNULL(COALESCE(SUM(CASE WHEN T_LINES_FILTER0."include" THEN CASE
> WHEN (T_LINES_FILTER0_PART_SUB_JOB_COSTING.REVISION IS NULL) THEN NULL ELSE
> IFNULL(T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL.VC0, 0) END ELSE 0
> END), 0), 0) AS VC0,
>         T_LINES_FILTER0."enquiry" AS GROUPBY
>     FROM "part_sub" T_LINES_FILTER0_PART_SUB
>         /* "part_sub".tableScan */
>     INNER JOIN "job" T_LINES_FILTER0_PART_SUB_JOB
>         /* PRIMARY_KEY_1A: "Id" IS T_LINES_FILTER0_PART_SUB."Id" */
>         ON 1=1
>         /* WHERE T_LINES_FILTER0_PART_SUB."Id" IS
> T_LINES_FILTER0_PART_SUB_JOB."Id"
>         */
>     LEFT OUTER JOIN "costing" T_LINES_FILTER0_PART_SUB_JOB_COSTING
>         /* PRIMARY_KEY_C03: "Id" IS T_LINES_FILTER0_PART_SUB_JOB."costing"
> */
>         ON T_LINES_FILTER0_PART_SUB_JOB."costing" IS
> T_LINES_FILTER0_PART_SUB_JOB_COSTING."Id"
>     LEFT OUTER JOIN (
>         SELECT
>
> IFNULL(COALESCE(SUM(T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."total_time"),
> 0), 0) AS VC0,
>             T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing" AS
> GROUPBY
>         FROM "costing_internal"
> T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL
>             /* FK_COSTING_INTERNAL_COSTING_INDEX_F */
>         WHERE TRUE
>         GROUP BY T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing"
>         /* group sorted */
>     ) T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL
>         /* SELECT
>
> IFNULL(COALESCE(SUM(T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."total_time"),
> 0), 0) AS VC0,
>             T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing" AS
> GROUPBY
>         FROM "costing_internal"
> T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL
>             /++ FK_COSTING_INTERNAL_COSTING_INDEX_F: "costing" IS ?1 ++/
>         WHERE T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing" IS ?1
>         GROUP BY T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing"
>         /++ group sorted ++/: GROUPBY IS
> T_LINES_FILTER0_PART_SUB_JOB_COSTING."Id"
>          */
>         ON T_LINES_FILTER0_PART_SUB_JOB_COSTING."Id" IS
> T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL.GROUPBY
>     INNER JOIN "enquiry_line" T_LINES_FILTER0
>         /* PRIMARY_KEY_B95: "Id" IS T_LINES_FILTER0_PART_SUB."Id" */
>         ON 1=1
>     WHERE T_LINES_FILTER0."include"
>         AND ((T_LINES_FILTER0_PART_SUB."Id" IS
> T_LINES_FILTER0_PART_SUB_JOB."Id")
>         AND (T_LINES_FILTER0."Id" IS T_LINES_FILTER0_PART_SUB."Id"))
>     GROUP BY T_LINES_FILTER0."enquiry"
> ) T_LINES_FILTER0
>     /* SELECT
>         IFNULL(COALESCE(SUM(CASE WHEN T_LINES_FILTER0."include" THEN CASE
> WHEN (T_LINES_FILTER0_PART_SUB_JOB_COSTING.REVISION IS NULL) THEN NULL ELSE
> IFNULL(T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL.VC0, 0) END ELSE 0
> END), 0), 0) AS VC0,
>         T_LINES_FILTER0."enquiry" AS GROUPBY
>     FROM "part_sub" T_LINES_FILTER0_PART_SUB
>         /++ "part_sub".tableScan ++/
>     INNER JOIN "job" T_LINES_FILTER0_PART_SUB_JOB
>         /++ PRIMARY_KEY_1A: "Id" IS T_LINES_FILTER0_PART_SUB."Id" ++/
>         ON 1=1
>         /++ WHERE T_LINES_FILTER0_PART_SUB."Id" IS
> T_LINES_FILTER0_PART_SUB_JOB."Id"
>         ++/
>     LEFT OUTER JOIN "costing" T_LINES_FILTER0_PART_SUB_JOB_COSTING
>         /++ PRIMARY_KEY_C03: "Id" IS
> T_LINES_FILTER0_PART_SUB_JOB."costing" ++/
>         ON T_LINES_FILTER0_PART_SUB_JOB."costing" IS
> T_LINES_FILTER0_PART_SUB_JOB_COSTING."Id"
>     LEFT OUTER JOIN (
>         SELECT
>
> IFNULL(COALESCE(SUM(T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."total_time"),
> 0), 0) AS VC0,
>             T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing" AS
> GROUPBY
>         FROM "costing_internal"
> T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL
>             /++ FK_COSTING_INTERNAL_COSTING_INDEX_F ++/
>         WHERE TRUE
>         GROUP BY T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing"
>         /++ group sorted ++/
>     ) T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL
>         /++ SELECT
>
> IFNULL(COALESCE(SUM(T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."total_time"),
> 0), 0) AS VC0,
>             T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing" AS
> GROUPBY
>         FROM "costing_internal"
> T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL
>             /++ FK_COSTING_INTERNAL_COSTING_INDEX_F: "costing" IS ?2 ++/
>         WHERE T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing" IS ?2
>         GROUP BY T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL."costing"
>         /++ group sorted ++/: GROUPBY IS
> T_LINES_FILTER0_PART_SUB_JOB_COSTING."Id"
>          ++/
>         ON T_LINES_FILTER0_PART_SUB_JOB_COSTING."Id" IS
> T_LINES_FILTER0_PART_SUB_JOB_COSTING_INTERNAL.GROUPBY
>     INNER JOIN "enquiry_line" T_LINES_FILTER0
>         /++ PRIMARY_KEY_B95: "Id" IS T_LINES_FILTER0_PART_SUB."Id" ++/
>         ON TRUE
>     WHERE (T_LINES_FILTER0."enquiry" IS ?1)
>         AND (T_LINES_FILTER0."include"
>         AND ((T_LINES_FILTER0_PART_SUB."Id" IS
> T_LINES_FILTER0_PART_SUB_JOB."Id")
>         AND (T_LINES_FILTER0."Id" IS T_LINES_FILTER0_PART_SUB."Id")))
>     GROUP BY T_LINES_FILTER0."enquiry": GROUPBY IS T."Id"
>      */
>     ON T."Id" IS T_LINES_FILTER0.GROUPBY
>
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.

Reply via email to