[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12422753 ] Satheesh Bandaram commented on DERBY-781: - I think it would be good to modify this improvement description, as it will likely be picked up by release notes and/or other documentation. The fix is more generic than 'UNION' subqueries as the original description says. Also the example in the description doesn't apply anymore, I think. When the entry was made, join-predicate push down work wasn't completed, so the example in the description would have shown the problem, I think. But now, (post join-predicate pushdown work) the example may not apply. Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Issue Type: Improvement Components: SQL Affects Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Assigned To: A B Attachments: d781_v1.patch, d781_v1.stat, d781_v2.patch, DERBY-781_v1.html Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12422768 ] A B commented on DERBY-781: --- Oh wait, check that last comment. That was only for the summary; I see now that you were talking about the actual description. Sorry. Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Issue Type: Improvement Components: SQL Affects Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Assigned To: A B Attachments: d781_v1.patch, d781_v1.stat, d781_v2.patch, DERBY-781_v1.html Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12422767 ] A B commented on DERBY-781: --- I think it would be good to modify this improvement description Seems like this could be a simple as removing the word union from the description--does that sound reasonable to you? Or do you want an entirely new description? Something like Subquery materialization via hash join or Support subquery materialization by allowing the optimizer to cost and generate hash joins with subqueries. Any preference? Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Issue Type: Improvement Components: SQL Affects Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Assigned To: A B Attachments: d781_v1.patch, d781_v1.stat, d781_v2.patch, DERBY-781_v1.html Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12421652 ] A B commented on DERBY-781: --- Thank you so much for volunteering to do this review, Bryan--and for taking the time to read the write-up in its rather wordy entirety. I really appreciate your time and effort here. I'll work on putting together a release note as you described on derby-dev: http://article.gmane.org/gmane.comp.apache.db.derby.devel/23875 and will post that to this issue and/or to DERBY-1357. Thanks again for all of your time, Bryan! Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Issue Type: Improvement Components: SQL Affects Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Assigned To: A B Attachments: d781_v1.patch, d781_v1.stat, d781_v2.patch, DERBY-781_v1.html Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12421314 ] Bryan Pendleton commented on DERBY-781: --- Hi Army, Thanks again for the great writeup, and for putting the energy into clear comments and careful changes to the code. It really makes a huge difference when trying to read the code. I've read through the writeup carefully, and checked it against the patch, and I have no comments or suggestions to make. The patch applied cleanly for me, built without problems, and lang/subquery.sql and lang/predicatesIntoviews.sql both passed in my environment. It looks like an excellent patch to me; I am +1 for commit. Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Issue Type: Improvement Components: SQL Affects Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Assigned To: A B Attachments: d781_v1.patch, d781_v1.stat, d781_v2.patch, DERBY-781_v1.html Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12361784 ] Satheesh Bandaram commented on DERBY-781: - Thanks Jeff for your analysis. I was just getting ready to file another improvement request to make this optimization more generic. (not specific to unions) I have seen huge improvements in two different customer situations. For the situation I filed the defect, each of the views (V1 and V2) had 36 tables each and by materializing the inner view into a temp. table, I noticed speed up from 70-150 seconds to under 3 seconds. (including the cost of creating temp. table) I also saw another situation later without unions where materializing some table subqueries improved performance by couple of orders of magnitude. So you are right... this optimization can be applied to other cases too. I think materialization with or without hash joins should be useful. In both situations, creating temp. table that materialized derived tables improved so much. Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Type: Improvement Components: SQL Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
[ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12361785 ] Satheesh Bandaram commented on DERBY-781: - Thanks Jeff for your analysis. I was just getting ready to file another improvement request to make this optimization more generic. (not specific to unions) I have seen huge improvements in two different customer situations. For the situation I filed the defect, each of the views (V1 and V2) had 36 tables each and by materializing the inner view into a temp. table, I noticed speed up from 70-150 seconds to under 3 seconds. (including the cost of creating temp. table) I also saw another situation later without unions where materializing some table subqueries improved performance by couple of orders of magnitude. So you are right... this optimization can be applied to other cases too. I think materialization with or without hash joins should be useful. In both situations, creating temp. table that materialized derived tables improved so much. You are right that the optimization should be done inside the optimizer. Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. - Key: DERBY-781 URL: http://issues.apache.org/jira/browse/DERBY-781 Project: Derby Type: Improvement Components: SQL Versions: 10.1.1.0, 10.2.0.0 Environment: generic Reporter: Satheesh Bandaram Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times. For example: create view V1 as select i, j from T1 union select i,j from T2; create view V2 as select a,b from T3 union select a,b from T4; insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times. Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always. public boolean performMaterialization(JBitSet outerTables) throws StandardException { // RESOLVE - just say no to materialization right now - should be a cost based decision return false; /* Actual materialization, if appropriate, will be placed by our parent PRN. * This is because PRN might have a join condition to apply. (Materialization * can only occur before that. */ //return true; } -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira