Re: Implementing Incremental View Maintenance
On Fri, Sep 09, 2022 at 08:10:32PM +0900, Yugo NAGATA wrote: > I am working on above issues (#1-#4) now, and I'll respond on each later. Okay, well. There has been some feedback sent lately and no update for one month, so I am marking it as RwF for now. As a whole the patch has been around for three years and it does not seem that a lot has happened in terms of design discussion (now the thread is long so I would easily miss something).. -- Michael signature.asc Description: PGP signature
Re: Implementing Incremental View Maintenance
Hello huyajun, I'm sorry for delay in my response. On Tue, 26 Jul 2022 12:00:26 +0800 huyajun wrote: > I read your patch and think this processing is greet, but there is a risk of > deadlock. > Although I have not thought of a suitable processing method for the time > being, > it is also acceptable for truncate scenarios.The deadlock scene is as follows: > > Mv define is: select * from base_a,base_b; > S1: truncate base_a; ― only AccessExclusiveLock base_a and not run into after > trigger > S2: insert into base_b; ― The update has been completed and the incremental > refresh is started in the after trigger,RowExclusive on base_b and > ExclusiveLock on mv > S1: continue truncate mv, wait for AccessExclusiveLock on mv, wait for S2 > S2: continue refresh mv, wait for AccessShardLock on base_a, wait for S1 > So deadlock occurred Hmm, this deadlock scenario is possible, indeed. One idea to resolve it is to acquire RowExclusive locks on all base tables in the BEFORE trigger. If so, S2 can not progress its process because it waits for a RowExclusive lock on base_b, and it can not acquire ExeclusiveLock on mv before S1 finishes. > I also found some new issues that I would like to discuss with you Thank you so much for your massive bug reports! > 1. Concurrent DML causes imv data error, case like below > Setup: > Create table t( a int); > Insert into t select 1 from generate_series(1,3); > create incremental materialized view s as select count(*) from t; > > S1: begin;delete from t where ctid in (select ctid from t limit 1); > S2: begin;delete from t where ctid in (select ctid from t limit 1 offset 1); > S1: commit; > S2: commit; > > After this, The count data of s becomes 2 but correct data is 1. > I found out that the problem is probably because to our use of ctid update > Consider user behavior unrelated to imv: > > Create table t( a int); > Insert into t select 1; > s1: BEGIN > s1: update t set a = 2 where ctid in (select ctid from t); -- UPDATE 1 > s2: BEGIN > s2: update t set a = 3 where ctid in (select ctid from t); -- wait row lock > s1: COMMIT > s2: -- UPDATE 0 -- ctid change so can't UPDATE one rows > So we lost the s2 update > > 2. Sometimes it will crash when the columns of the created materialized view > do not match > Create table t( a int); > create incremental materialized view s(z) as select sum(1) as a, sum(1) as b > from t; > > The problem should be that colNames in rewriteQueryForIMMV does not consider > this situation > > 3. Sometimes no error when the columns of the created materialized view do > not match > Create table t( a int); > create incremental materialized view s(y,z) as select count(1) as b from t; > > But the hidden column of IMV is overwritten to z which will cause refresh > failed. > > The problem should be that checkRuleResultList we should only skip imv hidden > columns check > > 4. A unique index should not be created in the case of a Cartesian product > > create table base_a (i int primary key, j varchar); > create table base_b (i int primary key, k varchar); > INSERT INTO base_a VALUES > (1,10), > (2,20), > (3,30), > (4,40), > (5,50); > INSERT INTO base_b VALUES > (1,101), > (2,102), > (3,103), > (4,104); > CREATE incremental MATERIALIZED VIEW s as > select base_a.i,base_a.j from base_a,base_b; ― create error because of unique > index I am working on above issues (#1-#4) now, and I'll respond on each later. > 5. Besides, I would like to ask you if you have considered implementing an > IMV with delayed refresh? > The advantage of delayed refresh is that it will not have much impact on > write performance Yes, I've been thinking to implement deferred maintenance since the beginning of this IVM project. However, we've decided to start from immediate maintenance, and will plan to propose deferred maintenance to the core after the current patch is accepted. (I plan to implement this feature in pg_ivm extension module first, though.) > I probably have some ideas about it now, do you think it works? > 1. After the base table is updated, the delayed IMV's after trigger is used > to record the delta > information in another table similar to the incremental log of the base table > 2. When incremental refresh, use the data in the log instead of the data in > the trasient table > of the after trigger > 3. We need to merge the incremental information in advance to ensure that the > base_table > after transaction filtering UNION ALL old_delta is the state before the base > table is updated > Case like below: > Create table t( a int); > ―begin to record log > Insert into t select 1; ― newlog: 1 oldlog: empty > Delete from t; ―newlog:1, oldlog:1 > ― begin to incremental refresh > Select * from t where xmin < xid or (xmin = xid and cmin < cid); ― empty > So this union all oldlog is not equal to before the base table is updated > We need merge the incremental log in advance to make newlog: empty, oldlog: > empty > > If implemented,
Re: Implementing Incremental View Maintenance
Hi, Nagata-san Thank you for your answer, I agree with your opinion, and found some new problems to discuss with you > >> 3. Consider truncate base tables, IVM will not refresh, maybe raise an error >> will be better > > I fixed to support TRUNCATE on base tables in our repository. > https://github.com/sraoss/pgsql-ivm/commit/a1365ed69f34e1adbd160f2ce8fd1e80e032392f > > When a base table is truncated, the view content will be empty if the > view definition query does not contain an aggregate without a GROUP clause. > Therefore, such views can be truncated. > > Aggregate views without a GROUP clause always have one row. Therefore, > if a base table is truncated, the view will not be empty and will contain > a row with NULL value (or 0 for count()). So, in this case, we refresh the > view instead of truncating it. > > The next version of the patch-set will include this change. > I read your patch and think this processing is greet, but there is a risk of deadlock. Although I have not thought of a suitable processing method for the time being, it is also acceptable for truncate scenarios.The deadlock scene is as follows: Mv define is: select * from base_a,base_b; S1: truncate base_a; — only AccessExclusiveLock base_a and not run into after trigger S2: insert into base_b; — The update has been completed and the incremental refresh is started in the after trigger,RowExclusive on base_b and ExclusiveLock on mv S1: continue truncate mv, wait for AccessExclusiveLock on mv, wait for S2 S2: continue refresh mv, wait for AccessShardLock on base_a, wait for S1 So deadlock occurred I also found some new issues that I would like to discuss with you 1. Concurrent DML causes imv data error, case like below Setup: Create table t( a int); Insert into t select 1 from generate_series(1,3); create incremental materialized view s as select count(*) from t; S1: begin;delete from t where ctid in (select ctid from t limit 1); S2: begin;delete from t where ctid in (select ctid from t limit 1 offset 1); S1: commit; S2: commit; After this, The count data of s becomes 2 but correct data is 1. I found out that the problem is probably because to our use of ctid update Consider user behavior unrelated to imv: Create table t( a int); Insert into t select 1; s1: BEGIN s1: update t set a = 2 where ctid in (select ctid from t); -- UPDATE 1 s2: BEGIN s2: update t set a = 3 where ctid in (select ctid from t); -- wait row lock s1: COMMIT s2: -- UPDATE 0 -- ctid change so can't UPDATE one rows So we lost the s2 update 2. Sometimes it will crash when the columns of the created materialized view do not match Create table t( a int); create incremental materialized view s(z) as select sum(1) as a, sum(1) as b from t; The problem should be that colNames in rewriteQueryForIMMV does not consider this situation 3. Sometimes no error when the columns of the created materialized view do not match Create table t( a int); create incremental materialized view s(y,z) as select count(1) as b from t; But the hidden column of IMV is overwritten to z which will cause refresh failed. The problem should be that checkRuleResultList we should only skip imv hidden columns check 4. A unique index should not be created in the case of a Cartesian product create table base_a (i int primary key, j varchar); create table base_b (i int primary key, k varchar); INSERT INTO base_a VALUES (1,10), (2,20), (3,30), (4,40), (5,50); INSERT INTO base_b VALUES (1,101), (2,102), (3,103), (4,104); CREATE incremental MATERIALIZED VIEW s as select base_a.i,base_a.j from base_a,base_b; — create error because of unique index 5. Besides, I would like to ask you if you have considered implementing an IMV with delayed refresh? The advantage of delayed refresh is that it will not have much impact on write performance I probably have some ideas about it now, do you think it works? 1. After the base table is updated, the delayed IMV's after trigger is used to record the delta information in another table similar to the incremental log of the base table 2. When incremental refresh, use the data in the log instead of the data in the trasient table of the after trigger 3. We need to merge the incremental information in advance to ensure that the base_table after transaction filtering UNION ALL old_delta is the state before the base table is updated Case like below: Create table t( a int); —begin to record log Insert into t select 1; — newlog: 1 oldlog: empty Delete from t; —newlog:1, oldlog:1 — begin to incremental refresh Select * from t where xmin < xid or (xmin = xid and cmin < cid); — empty So this union all oldlog is not equal to before the base table is updated We need merge the incremental log in advance to make newlog: empty, oldlog: empty If implemented, incremental refresh must still be serialized, but the DML of the base table can not be blocked, that is to say, the base table can still record logs during incremental
Re: Implementing Incremental View Maintenance
Hi huyajun, Thank you for your comments! On Wed, 29 Jun 2022 17:56:39 +0800 huyajun wrote: > Hi, Nagata-san > I read your patch with v27 version and has some new comments,I want to > discuss with you. > > 1. How about use DEPENDENCY_INTERNAL instead of DEPENDENCY_AUTO > when record dependence on trigger created by IMV.( related code is in the > end of CreateIvmTrigger) > Otherwise, User can use sql to drop trigger and corrupt IVM, > DEPENDENCY_INTERNAL is also semantically more correct > Crash case like: >create table t( a int); >create incremental materialized view s as select * from t; >drop trigger "IVM_trigger_”; >Insert into t values(1); We use DEPENDENCY_AUTO because we want to delete the triggers when REFRESH ... WITH NO DATA is performed on the materialized view in order to disable IVM. Triggers created with DEPENDENCY_INTERNAL cannot be dropped. Such triggers are re-created when REFRESH ... [WITH DATA] is performed. We can use DEPENDENCY_INTERNAL if we disable/enable such triggers instead of dropping/re-creating them, although users also can disable triggers using ALTER TRIGGER. > 2. In get_matching_condition_string, Considering NULL values, we can not use > simple = operator. > But how about 'record = record', record_eq treat NULL = NULL > it should fast than current implementation for only one comparation > Below is my simple implementation with this, Variables are named arbitrarily.. > I test some cases it’s ok > > static char * > get_matching_condition_string(List *keys) > { > StringInfoData match_cond; > ListCell*lc; > > /* If there is no key columns, the condition is always true. */ > if (keys == NIL) > return "true"; > else > { > StringInfoData s1; > StringInfoData s2; > initStringInfo(_cond); > initStringInfo(); > initStringInfo(); > /* Considering NULL values, we can not use simple = operator. */ > appendStringInfo(, "ROW("); > appendStringInfo(, "ROW("); > foreach (lc, keys) > { > Form_pg_attribute attr = (Form_pg_attribute) lfirst(lc); > char *resname = NameStr(attr->attname); > char *mv_resname = quote_qualified_identifier("mv", resname); > char *diff_resname = quote_qualified_identifier("diff", > resname); > > appendStringInfo(, "%s", mv_resname); > appendStringInfo(, "%s", diff_resname); > > if (lnext(lc)) > { > appendStringInfo(, ", "); > appendStringInfo(, ", "); > } > } > appendStringInfo(, ")::record"); > appendStringInfo(, ")::record"); > appendStringInfo(_cond, "%s operator(pg_catalog.=) %s", > s1.data, s2.data); > return match_cond.data; > } > } As you say, we don't have to use IS NULL if we use ROW(...)::record, but we cannot use an index in this case and it makes IVM ineffecient. As showed bellow (#5), an index works even when we use simple = operations together with together "IS NULL" operations. > 3. Consider truncate base tables, IVM will not refresh, maybe raise an error > will be better I fixed to support TRUNCATE on base tables in our repository. https://github.com/sraoss/pgsql-ivm/commit/a1365ed69f34e1adbd160f2ce8fd1e80e032392f When a base table is truncated, the view content will be empty if the view definition query does not contain an aggregate without a GROUP clause. Therefore, such views can be truncated. Aggregate views without a GROUP clause always have one row. Therefore, if a base table is truncated, the view will not be empty and will contain a row with NULL value (or 0 for count()). So, in this case, we refresh the view instead of truncating it. The next version of the patch-set will include this change. > 4. In IVM_immediate_before,I know Lock base table with ExclusiveLock is > for concurrent updates to the IVM correctly, But how about to Lock it when > actually > need to maintain MV which in IVM_immediate_maintenance > In this way you don't have to lock multiple times. Yes, as you say, we don't have to lock the view multiple times. I'll investigate better locking ways including the way that you suggest. > 5. Why we need CreateIndexOnIMMV, is it a optimize? >It seems like when maintenance MV, >the index may not be used because of our match conditions can’t use > simple = operator No, the index works even when we use simple = operator together with "IS NULL". For example: postgres=# \d mv Materialized view "public.mv" Column | Type | Collation | Nullable | Default +-+---+--+- id | integer | | | v1 | integer | | | v2 | integer | | | Indexes: "mv_index" UNIQUE, btree (id) NULLS NOT DISTINCT postgres=# EXPLAIN ANALYZE WITH
Re: Implementing Incremental View Maintenance
> 2022年4月22日 下午1:58,Yugo NAGATA 写道: > > On Fri, 22 Apr 2022 11:29:39 +0900 > Yugo NAGATA wrote: > >> Hi, >> >> On Fri, 1 Apr 2022 11:09:16 -0400 >> Greg Stark wrote: >> >>> This patch has bitrotted due to some other patch affecting trigger.c. >>> >>> Could you post a rebase? >>> >>> This is the last week of the CF before feature freeze so time is of the >>> essence. >> >> I attached a rebased patch-set. >> >> Also, I made the folowing changes from the previous. >> >> 1. Fix to not use a new deptye >> >> In the previous patch, we introduced a new deptye 'm' into pg_depend. >> This deptype was used for looking for IVM triggers to be removed at >> REFRESH WITH NO DATA. However, we decided to not use it for reducing >> unnecessary change in the core code. Currently, the trigger name and >> dependent objclass are used at that time instead of it. >> >> As a result, the number of patches are reduced to nine from ten. > > >> 2. Bump the version numbers in psql and pg_dump >> >> This feature's target is PG 16 now. > > Sorry, I revert this change. It was too early to bump up the > version number. > > -- > Yugo NAGATA > > Hi, Nagata-san I read your patch with v27 version and has some new comments,I want to discuss with you. 1. How about use DEPENDENCY_INTERNAL instead of DEPENDENCY_AUTO when record dependence on trigger created by IMV.( related code is in the end of CreateIvmTrigger) Otherwise, User can use sql to drop trigger and corrupt IVM, DEPENDENCY_INTERNAL is also semantically more correct Crash case like: create table t( a int); create incremental materialized view s as select * from t; drop trigger "IVM_trigger_”; Insert into t values(1); 2. In get_matching_condition_string, Considering NULL values, we can not use simple = operator. But how about 'record = record', record_eq treat NULL = NULL it should fast than current implementation for only one comparation Below is my simple implementation with this, Variables are named arbitrarily.. I test some cases it’s ok static char * get_matching_condition_string(List *keys) { StringInfoData match_cond; ListCell*lc; /* If there is no key columns, the condition is always true. */ if (keys == NIL) return "true"; else { StringInfoData s1; StringInfoData s2; initStringInfo(_cond); initStringInfo(); initStringInfo(); /* Considering NULL values, we can not use simple = operator. */ appendStringInfo(, "ROW("); appendStringInfo(, "ROW("); foreach (lc, keys) { Form_pg_attribute attr = (Form_pg_attribute) lfirst(lc); char *resname = NameStr(attr->attname); char *mv_resname = quote_qualified_identifier("mv", resname); char *diff_resname = quote_qualified_identifier("diff", resname); appendStringInfo(, "%s", mv_resname); appendStringInfo(, "%s", diff_resname); if (lnext(lc)) { appendStringInfo(, ", "); appendStringInfo(, ", "); } } appendStringInfo(, ")::record"); appendStringInfo(, ")::record"); appendStringInfo(_cond, "%s operator(pg_catalog.=) %s", s1.data, s2.data); return match_cond.data; } } 3. Consider truncate base tables, IVM will not refresh, maybe raise an error will be better 4. In IVM_immediate_before,I know Lock base table with ExclusiveLock is for concurrent updates to the IVM correctly, But how about to Lock it when actually need to maintain MV which in IVM_immediate_maintenance In this way you don't have to lock multiple times. 5. Why we need CreateIndexOnIMMV, is it a optimize? It seems like when maintenance MV, the index may not be used because of our match conditions can’t use simple = operator Looking forward to your early reply to answer my above doubts, thank you a lot! Regards, Yajun Hu
Re: Implementing Incremental View Maintenance
Hello Greg, On Sat, 23 Apr 2022 08:18:01 +0200 Greg Stark wrote: > I'm trying to figure out how to get this feature more attention. Everyone > agrees it would be a huge help but it's a scary patch to review. > > I wonder if it would be helpful to have a kind of "readers guide" > explanation of the patches to help a reviewer understand what the point of > each patch is and how the whole system works? I think Andres and Robert > have both taken that approach before with big patches and it really helped > imho. Thank you very much for your suggestion! Following your advice, I am going to write a readers guide referring to the past posts of Andres and Rebert. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
I'm trying to figure out how to get this feature more attention. Everyone agrees it would be a huge help but it's a scary patch to review. I wonder if it would be helpful to have a kind of "readers guide" explanation of the patches to help a reviewer understand what the point of each patch is and how the whole system works? I think Andres and Robert have both taken that approach before with big patches and it really helped imho. On Fri., Apr. 22, 2022, 08:01 Yugo NAGATA, wrote: > On Fri, 22 Apr 2022 11:29:39 +0900 > Yugo NAGATA wrote: > > > Hi, > > > > On Fri, 1 Apr 2022 11:09:16 -0400 > > Greg Stark wrote: > > > > > This patch has bitrotted due to some other patch affecting trigger.c. > > > > > > Could you post a rebase? > > > > > > This is the last week of the CF before feature freeze so time is of > the essence. > > > > I attached a rebased patch-set. > > > > Also, I made the folowing changes from the previous. > > > > 1. Fix to not use a new deptye > > > > In the previous patch, we introduced a new deptye 'm' into pg_depend. > > This deptype was used for looking for IVM triggers to be removed at > > REFRESH WITH NO DATA. However, we decided to not use it for reducing > > unnecessary change in the core code. Currently, the trigger name and > > dependent objclass are used at that time instead of it. > > > > As a result, the number of patches are reduced to nine from ten. > > > > 2. Bump the version numbers in psql and pg_dump > > > > This feature's target is PG 16 now. > > Sorry, I revert this change. It was too early to bump up the > version number. > > -- > Yugo NAGATA >
Re: Implementing Incremental View Maintenance
This patch has bitrotted due to some other patch affecting trigger.c. Could you post a rebase? This is the last week of the CF before feature freeze so time is of the essence.
Re: Implementing Incremental View Maintenance
Hello Zhihong Yu, I already replied to your comments before, but I forgot to include the list to CC, so I resend the same again. Sorry for the duplicate emails. On Thu, 3 Feb 2022 09:51:52 -0800 Zhihong Yu wrote: > For CreateIndexOnIMMV(): > > + ereport(NOTICE, > + (errmsg("could not create an index on materialized view > \"%s\" automatically", > ... > + return; > + } > > Should the return type be changed to bool so that the caller knows whether > the index creation succeeds ? > If index creation is unsuccessful, should the call > to CreateIvmTriggersOnBaseTables() be skipped ? CreateIvmTriggersOnBaseTables() have to be called regardless of whether an index is created successfully or not, so I think CreateindexOnIMMV() doesn't have to return the result for now. > For check_ivm_restriction_walker(): > > + break; > + expression_tree_walker(node, check_ivm_restriction_walker, > NULL); > + break; > > Something is missing between the break and expression_tree_walker(). Yes, it's my mistake during making the patch-set. I fixed it in the updated patch I attached in the other post. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On Wed, 16 Feb 2022 22:34:18 +0800 huyajun wrote: > Hi, Nagata-san > I am very interested in IMMV and read your patch but have some comments in > v25-0007-Add-Incremental-View-Maintenance-support.patch and want to discuss > with you. Thank you for your review! > > + /* For IMMV, we need to rewrite matview query */ > + query = rewriteQueryForIMMV(query, into->colNames); > + query_immv = copyObject(query); > > /* Create triggers on incremental maintainable materialized view */ > + Assert(query_immv != NULL); > + CreateIvmTriggersOnBaseTables(query_immv, > matviewOid, true); > 1. Do we need copy query?Is it okay that CreateIvmTriggersOnBaseTables > directly use (Query *) into->viewQuery instead of query_immv like > CreateIndexOnIMMV? It seems only planner may change query, but it shouldn't > affect us finding the correct base table in CreateIvmTriggersOnBaseTables . The copy to query_immv was necessary for supporting sub-queries in the view definition. However, we excluded the fueature from the current patch to reduce the patch size, so it would be unnecessary. I'll fix it. > > +void > +CreateIndexOnIMMV(Query *query, Relation matviewRel) > +{ > + Query *qry = (Query *) copyObject(query); > 2. Also, is it okay to not copy query in CreateIndexOnIMMV? It seems we only > read query in CreateIndexOnIMMV. This was also necessary for supporting CTEs, but unnecessary in the current patch, so I'll fix it, too. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi, Nagata-san I am very interested in IMMV and read your patch but have some comments in v25-0007-Add-Incremental-View-Maintenance-support.patch and want to discuss with you. + /* For IMMV, we need to rewrite matview query */ + query = rewriteQueryForIMMV(query, into->colNames); + query_immv = copyObject(query); /* Create triggers on incremental maintainable materialized view */ + Assert(query_immv != NULL); + CreateIvmTriggersOnBaseTables(query_immv, matviewOid, true); 1. Do we need copy query?Is it okay that CreateIvmTriggersOnBaseTables directly use (Query *) into->viewQuery instead of query_immv like CreateIndexOnIMMV? It seems only planner may change query, but it shouldn't affect us finding the correct base table in CreateIvmTriggersOnBaseTables . +void +CreateIndexOnIMMV(Query *query, Relation matviewRel) +{ + Query *qry = (Query *) copyObject(query); 2. Also, is it okay to not copy query in CreateIndexOnIMMV? It seems we only read query in CreateIndexOnIMMV. Regards,
Re: Implementing Incremental View Maintenance
On Thu, Feb 3, 2022 at 8:50 AM Yugo NAGATA wrote: > On Thu, 3 Feb 2022 08:48:00 -0800 > Zhihong Yu wrote: > > > On Thu, Feb 3, 2022 at 8:28 AM Yugo NAGATA wrote: > > > > > Hi, > > > > > > On Thu, 13 Jan 2022 18:23:42 +0800 > > > Julien Rouhaud wrote: > > > > > > > Hi, > > > > > > > > On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote: > > > > > On Wed, 24 Nov 2021 04:31:25 + > > > > > "r.takahash...@fujitsu.com" wrote: > > > > > > > > > > > > > > > > > I checked the same procedure on v24 patch. > > > > > > But following error occurs instead of the original error. > > > > > > > > > > > > ERROR: relation "ivm_t_index" already exists > > > > > > > > > > Thank you for pointing out it! > > > > > > > > > > Hmmm, an index is created when IMMV is defined, so CREAE INDEX > called > > > > > after this would fail... Maybe, we should not create any index > > > automatically > > > > > if IMMV is created WITH NO DATA. > > > > > > > > > > I'll fix it after some investigation. > > > > > > > > Are you still investigating on that problem? Also, the patchset > doesn't > > > apply > > > > anymore: > > > > > > I attached the updated and rebased patch set. > > > > > > I fixed to not create a unique index when an IMMV is created WITH NO > DATA. > > > Instead, the index is created by REFRESH WITH DATA only when the same > one > > > is not created yet. > > > > > > Also, I fixed the documentation to describe that foreign tables and > > > partitioned > > > tables are not supported according with Takahashi-san's suggestion. > > > > > > > There isn't any answer to your following email summarizing the > feature > > > yet, so > > > > I'm not sure what should be the status of this patch, as there's no > ideal > > > > category for that. For now I'll change the patch to Waiting on > Author > > > on the > > > > cf app, feel free to switch it back to Needs Review if you think it's > > > more > > > > suitable, at least for the design discussion need. > > > > > > I changed the status to Needs Review. > > > > > > > > > Hi, > > Did you intend to attach updated patch ? > > > > I don't seem to find any. > > Oops, I attached. Thanks! > > Hi, For CreateIndexOnIMMV(): + ereport(NOTICE, + (errmsg("could not create an index on materialized view \"%s\" automatically", ... + return; + } Should the return type be changed to bool so that the caller knows whether the index creation succeeds ? If index creation is unsuccessful, should the call to CreateIvmTriggersOnBaseTables() be skipped ? For check_ivm_restriction_walker(): + break; + expression_tree_walker(node, check_ivm_restriction_walker, NULL); + break; Something is missing between the break and expression_tree_walker(). Cheers
Re: Implementing Incremental View Maintenance
On Thu, Feb 3, 2022 at 8:28 AM Yugo NAGATA wrote: > Hi, > > On Thu, 13 Jan 2022 18:23:42 +0800 > Julien Rouhaud wrote: > > > Hi, > > > > On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote: > > > On Wed, 24 Nov 2021 04:31:25 + > > > "r.takahash...@fujitsu.com" wrote: > > > > > > > > > > > I checked the same procedure on v24 patch. > > > > But following error occurs instead of the original error. > > > > > > > > ERROR: relation "ivm_t_index" already exists > > > > > > Thank you for pointing out it! > > > > > > Hmmm, an index is created when IMMV is defined, so CREAE INDEX called > > > after this would fail... Maybe, we should not create any index > automatically > > > if IMMV is created WITH NO DATA. > > > > > > I'll fix it after some investigation. > > > > Are you still investigating on that problem? Also, the patchset doesn't > apply > > anymore: > > I attached the updated and rebased patch set. > > I fixed to not create a unique index when an IMMV is created WITH NO DATA. > Instead, the index is created by REFRESH WITH DATA only when the same one > is not created yet. > > Also, I fixed the documentation to describe that foreign tables and > partitioned > tables are not supported according with Takahashi-san's suggestion. > > > There isn't any answer to your following email summarizing the feature > yet, so > > I'm not sure what should be the status of this patch, as there's no ideal > > category for that. For now I'll change the patch to Waiting on Author > on the > > cf app, feel free to switch it back to Needs Review if you think it's > more > > suitable, at least for the design discussion need. > > I changed the status to Needs Review. > > > Hi, Did you intend to attach updated patch ? I don't seem to find any. FYI
Re: Implementing Incremental View Maintenance
Hi, On Thu, 13 Jan 2022 18:23:42 +0800 Julien Rouhaud wrote: > Hi, > > On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote: > > On Wed, 24 Nov 2021 04:31:25 + > > "r.takahash...@fujitsu.com" wrote: > > > > > > > > I checked the same procedure on v24 patch. > > > But following error occurs instead of the original error. > > > > > > ERROR: relation "ivm_t_index" already exists > > > > Thank you for pointing out it! > > > > Hmmm, an index is created when IMMV is defined, so CREAE INDEX called > > after this would fail... Maybe, we should not create any index automatically > > if IMMV is created WITH NO DATA. > > > > I'll fix it after some investigation. > > Are you still investigating on that problem? Also, the patchset doesn't apply > anymore: I attached the updated and rebased patch set. I fixed to not create a unique index when an IMMV is created WITH NO DATA. Instead, the index is created by REFRESH WITH DATA only when the same one is not created yet. Also, I fixed the documentation to describe that foreign tables and partitioned tables are not supported according with Takahashi-san's suggestion. > There isn't any answer to your following email summarizing the feature yet, so > I'm not sure what should be the status of this patch, as there's no ideal > category for that. For now I'll change the patch to Waiting on Author on the > cf app, feel free to switch it back to Needs Review if you think it's more > suitable, at least for the design discussion need. I changed the status to Needs Review. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi, On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote: > On Wed, 24 Nov 2021 04:31:25 + > "r.takahash...@fujitsu.com" wrote: > > > > > I checked the same procedure on v24 patch. > > But following error occurs instead of the original error. > > > > ERROR: relation "ivm_t_index" already exists > > Thank you for pointing out it! > > Hmmm, an index is created when IMMV is defined, so CREAE INDEX called > after this would fail... Maybe, we should not create any index automatically > if IMMV is created WITH NO DATA. > > I'll fix it after some investigation. Are you still investigating on that problem? Also, the patchset doesn't apply anymore: http://cfbot.cputube.org/patch_36_2138.log === Applying patches on top of PostgreSQL commit ID a18b6d2dc288dfa6e7905ede1d4462edd6a8af47 === [...] === applying patch ./v24-0005-Add-Incremental-View-Maintenance-support-to-pg_d.patch patching file src/bin/pg_dump/pg_dump.c Hunk #1 FAILED at 6393. Hunk #2 FAILED at 6596. Hunk #3 FAILED at 6719. Hunk #4 FAILED at 6796. Hunk #5 succeeded at 14953 (offset -915 lines). 4 out of 5 hunks FAILED -- saving rejects to file src/bin/pg_dump/pg_dump.c.rej There isn't any answer to your following email summarizing the feature yet, so I'm not sure what should be the status of this patch, as there's no ideal category for that. For now I'll change the patch to Waiting on Author on the cf app, feel free to switch it back to Needs Review if you think it's more suitable, at least for the design discussion need.
Re: Implementing Incremental View Maintenance
Hi hackers, This is a response to a comment in "Commitfest 2021-11 Patch Triage - Part 1" [1]. > 2138: Incremental Materialized View Maintenance > === > There seems to be concensus on the thread that this is a feature that we want, > and after initial design discussions there seems to be no disagreements with > the approach taken. The patch was marked ready for committer almost a year > ago, but have since been needs review (which seems correct). The size of the > patchset and the length of the thread make it hard to gauge just far away it > is, maybe the author or a review can summarize the current state and outline > what is left for it to be committable. [1] https://www.postgresql.org/message-id/6EDAAF93-1663-41D0-9148-76739104943E%40yesql.se I'll describe recent discussions and current status of this thread. * Recent Discussions and Current status 1. Previously, we proposed a patchset that supports outer-joins, some sub-queries and CTEs. However, aiming to reduce the size of the patchset, I proposed to omit these features from the first version of the patch in my post at 2021-08-02 [2]. Currently, we are proposing Incremental View Maintenance feature for PostgreSQL 15 that supports following queries in the view definition query. - inner joins including self-join - DISTINCT and views with tuple duplicates - some built-in aggregate functions (count, sum, agv, min, and max) Is it OK? Although there has been no opposite opinion, we want to confirm it. [2] https://www.postgresql.org/message-id/20210802152834.ecbaba6e17d1957547c3a55d%40sraoss.co.jp 2. Recently, There was a suggestion that we should support partitioned tables from Ryohei Takahashi, but I decided to not support it in the first release of IVM. Takahshi-san agreed with it, and the documentation will be fixed soon [3]. [3] https://www.postgresql.org/message-id/20211125154717.777e9d35ddde5f2e0d5d8355%40sraoss.co.jp 3. Takahashi-san pointed out that restoring pg_dump results causes an error. I am fixing it now.[4] [4] https://www.postgresql.org/message-id/20211125163710.2f32ae3d4be5d5f9ade020b6%40sraoss.co.jp The remaining is the summary of our proposal of IVM feature, its design, and past discussions. --- * Features Incremental View Maintenance (IVM) is a way to make materialized views up-to-date by computing only incremental changes and applying them on views. IVM is more efficient than REFRESH MATERIALIZED VIEW when only small parts of the view are changed. This patchset provides a feature that allows materialized views to be updated automatically and incrementally just after a underlying table is modified. You can create an incementally maintainable materialized view (IMMV) by using CREATE INCREMENTAL MATERIALIZED VIEW command. The followings are supported in view definition queries: - SELECT ... FROM ... WHERE ..., joins (inner joins, self-joins) - some built-in aggregate functions (count, sum, avg, min, max) - GROUP BY clause - DISTINCT clause Views can contain multiple tuples with the same content (duplicate tuples). The following are not supported in a view definition: - Outer joins - Aggregates otehr than above, window functions, HAVING - Sub-queries, CTEs - Set operations (UNION, INTERSECT, EXCEPT) - DISTINCT ON, ORDER BY, LIMIT, OFFSET Also, a view definition query cannot contain other views, materialized views, foreign tables, partitioned tables, partitions, VALUES, non-immutable functions, system columns, or expressions that contains aggregates. --- * Design An IMMV is maintained using statement-level AFTER triggers. When an IMMV is created, triggers are automatically created on all base tables contained in the view definition query. When a table is modified, the change that occurred in the table are extracted as transition tables in the AFTER triggers. Then, changes that will occur in the view are calculated by a rewritten view dequery in which the modified table is replaced with the transition table. For example, if the view is defined as "SELECT * FROM R, S", and tuples inserted into R are stored in a transiton table dR, the tuples that will be inserted into the view are calculated as the result of "SELECT * FROM dR, S". ** Multiple Tables Modification Multiple tables can be modified in a statement when using triggers, foreign key constraint, or modifying CTEs. When multiple tables are modified, we need the state of tables before the modification. For example, when some tuples, dR and dS, are inserted into R and S respectively, the tuples that will be inserted into the view are calculated by the following two queries: "SELECT * FROM dR, S_pre" "SELECT * FROM R, dS" where S_pre is the table before the modification, R is the current state of table, that is, after the
Re: Implementing Incremental View Maintenance
On Wed, 24 Nov 2021 04:31:25 + "r.takahash...@fujitsu.com" wrote: > > > ivm=# create table t (c1 int, c2 int); > > > CREATE TABLE > > > ivm=# create incremental materialized view ivm_t as select distinct c1 > > > from t; > > > NOTICE: created index "ivm_t_index" on materialized view "ivm_t" > > > SELECT 0 > > > > > > Then I executed pg_dump. > > > > > > In the dump, the following SQLs appear. > > > > > > CREATE INCREMENTAL MATERIALIZED VIEW public.ivm_t AS > > > SELECT DISTINCT t.c1 > > >FROM public.t > > > WITH NO DATA; > > > > > > ALTER TABLE ONLY public.ivm_t > > > ADD CONSTRAINT ivm_t_index UNIQUE (c1); > > > > > > If I execute psql with the result of pg_dump, following error occurs. > > > > > > ERROR: ALTER action ADD CONSTRAINT cannot be performed on relation > > "ivm_t" > > > DETAIL: This operation is not supported for materialized views. > > > > Good catch! It was my mistake creating unique constraints on IMMV in spite > > of > > we cannot defined them via SQL. I'll fix it to use unique indexes instead of > > constraints. > > I checked the same procedure on v24 patch. > But following error occurs instead of the original error. > > ERROR: relation "ivm_t_index" already exists Thank you for pointing out it! Hmmm, an index is created when IMMV is defined, so CREAE INDEX called after this would fail... Maybe, we should not create any index automatically if IMMV is created WITH NO DATA. I'll fix it after some investigation. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hello Takahashi-san, On Wed, 24 Nov 2021 04:27:13 + "r.takahash...@fujitsu.com" wrote: > Hi Nagata-san, > > > Sorry for late reply. > > > > However, even if we create triggers recursively on the parents or children, > > we would still > > need more consideration. This is because we will have to convert the format > > of tuple of > > modified table to the format of the table specified in the view for cases > > that the parent > > and some children have different format. > > > > I think supporting partitioned tables can be left for the next release. > > OK. I understand. > In the v24-patch, creating IVM on partions or partition table is prohibited. > It is OK but it should be documented. > > Perhaps, the following statement describe this. > If so, I think the definition of "simple base table" is ambiguous for some > users. > > + IMMVs must be based on simple base tables. It's not supported to > + create them on top of views or materialized views. Oh, I forgot to fix the documentation. I'll fix it. Ragards, Yugo Nagata -- Yugo NAGATA
RE: Implementing Incremental View Maintenance
Hi Nagata-san, > Ok. I'll fix _copyIntoClause() and _equalIntoClause() as well as > _readIntoClause() > and _outIntoClause(). OK. > > ivm=# create table t (c1 int, c2 int); > > CREATE TABLE > > ivm=# create incremental materialized view ivm_t as select distinct c1 from > > t; > > NOTICE: created index "ivm_t_index" on materialized view "ivm_t" > > SELECT 0 > > > > Then I executed pg_dump. > > > > In the dump, the following SQLs appear. > > > > CREATE INCREMENTAL MATERIALIZED VIEW public.ivm_t AS > > SELECT DISTINCT t.c1 > >FROM public.t > > WITH NO DATA; > > > > ALTER TABLE ONLY public.ivm_t > > ADD CONSTRAINT ivm_t_index UNIQUE (c1); > > > > If I execute psql with the result of pg_dump, following error occurs. > > > > ERROR: ALTER action ADD CONSTRAINT cannot be performed on relation > "ivm_t" > > DETAIL: This operation is not supported for materialized views. > > Good catch! It was my mistake creating unique constraints on IMMV in spite of > we cannot defined them via SQL. I'll fix it to use unique indexes instead of > constraints. I checked the same procedure on v24 patch. But following error occurs instead of the original error. ERROR: relation "ivm_t_index" already exists Regards, Ryohei Takahashi
RE: Implementing Incremental View Maintenance
Hi Nagata-san, Sorry for late reply. > However, even if we create triggers recursively on the parents or children, > we would still > need more consideration. This is because we will have to convert the format > of tuple of > modified table to the format of the table specified in the view for cases > that the parent > and some children have different format. > > I think supporting partitioned tables can be left for the next release. OK. I understand. In the v24-patch, creating IVM on partions or partition table is prohibited. It is OK but it should be documented. Perhaps, the following statement describe this. If so, I think the definition of "simple base table" is ambiguous for some users. + IMMVs must be based on simple base tables. It's not supported to + create them on top of views or materialized views. > DEPENDENCY_IMMV was added to clear that a certain trigger is related to IMMV. > We dropped the IVM trigger and its dependencies from IMMV when REFRESH ... > WITH NO DATA > is executed. Without the new deptype, we may accidentally delete a dependency > created > with an intention other than the IVM trigger. OK. I understand. > I think it is harder than you expected. When an IMMV is switched to a normal > materialized view, we needs to drop hidden columns (__ivm_count__ etc.), and > in > the opposite case, we need to create them again. The former (IMMV->IVM) might > be > easer, but for the latter (IVM->IMMV) I wonder we would need to re-create > IMMV. OK. I understand. Regards, Ryohei Takahashi
Re: Implementing Incremental View Maintenance
Hello Takahashi-san, On Wed, 22 Sep 2021 18:53:43 +0900 Yugo NAGATA wrote: > Hello Takahashi-san, > > On Thu, 5 Aug 2021 08:53:47 + > "r.takahash...@fujitsu.com" wrote: > > > Hi Nagata-san, > > > > > > Thank you for your reply. > > > > > I'll investigate this more, but we may have to prohibit views on > > > partitioned > > > table and partitions. > > > > I think this restriction is strict. > > This feature is useful when the base table is large and partitioning is > > also useful in such case. > > One reason of this issue is the lack of triggers on partitioned tables or > partitions that > are not specified in the view definition. > > However, even if we create triggers recursively on the parents or children, > we would still > need more consideration. This is because we will have to convert the format > of tuple of > modified table to the format of the table specified in the view for cases > that the parent > and some children have different format. > > I think supporting partitioned tables can be left for the next release. > > > > > I have several additional comments on the patch. > > > > > > (1) > > The following features are added to transition table. > > - Prolong lifespan of transition table > > - If table has row security policies, set them to the transition table > > - Calculate pre-state of the table > > > > Are these features only for IVM? > > If there are other useful case, they should be separated from IVM patch and > > should be independent patch for transition table. > > Maybe. However, we don't have good idea about use cases other than IVM of > them for now... > > > > > (2) > > DEPENDENCY_IMMV (m) is added to deptype of pg_depend. > > What is the difference compared with existing deptype such as > > DEPENDENCY_INTERNAL (i)? > > DEPENDENCY_IMMV was added to clear that a certain trigger is related to IMMV. > We dropped the IVM trigger and its dependencies from IMMV when REFRESH ... > WITH NO DATA > is executed. Without the new deptype, we may accidentally delete a dependency > created > with an intention other than the IVM trigger. > > > (3) > > Converting from normal materialized view to IVM or from IVM to normal > > materialized view is not implemented yet. > > Is it difficult? > > > > I think create/drop triggers and __ivm_ columns can achieve this feature. > > I think it is harder than you expected. When an IMMV is switched to a normal > materialized view, we needs to drop hidden columns (__ivm_count__ etc.), and > in > the opposite case, we need to create them again. The former (IMMV->IVM) might > be > easer, but for the latter (IVM->IMMV) I wonder we would need to re-create > IMMV. I am sorry but I found a mistake in the above description. "IMMV->IVM" and "IVM->IMMV" were wrong. I've should use "IMMV->MV" and "MV->IMMV" where MV means normal materialized view.w. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi hackers, I attached the updated patch including fixes reported by Zhihong Yu and Ryohei Takahashi. Regards, Yugo Nagata On Wed, 22 Sep 2021 19:12:27 +0900 Yugo NAGATA wrote: > Hello Takahashi-san, > > On Mon, 6 Sep 2021 10:06:37 + > "r.takahash...@fujitsu.com" wrote: > > > Hi Nagata-san, > > > > > > I'm still reading the patch. > > I have additional comments. > > Thank you for your comments! > > > > > (1) > > In v23-0001-Add-a-syntax-to-create-Incrementally-Maintainabl.patch, ivm > > member is added to IntoClause struct. > > I think it is necessary to modify _copyIntoClause() and _equalIntoClause() > > functions. > > Ok. I'll fix _copyIntoClause() and _equalIntoClause() as well as > _readIntoClause() and _outIntoClause(). > > > (2) > > By executing pg_dump with > > v23-0005-Add-Incremental-View-Maintenance-support-to-pg_d.patch, > > the constraint which is automatically created during "CREATE INCREMENTAL > > MATERIALIZED VIEW" is also dumped. > > This cause error during recovery as follows. > > > > ivm=# create table t (c1 int, c2 int); > > CREATE TABLE > > ivm=# create incremental materialized view ivm_t as select distinct c1 from > > t; > > NOTICE: created index "ivm_t_index" on materialized view "ivm_t" > > SELECT 0 > > > > Then I executed pg_dump. > > > > In the dump, the following SQLs appear. > > > > CREATE INCREMENTAL MATERIALIZED VIEW public.ivm_t AS > > SELECT DISTINCT t.c1 > >FROM public.t > > WITH NO DATA; > > > > ALTER TABLE ONLY public.ivm_t > > ADD CONSTRAINT ivm_t_index UNIQUE (c1); > > > > If I execute psql with the result of pg_dump, following error occurs. > > > > ERROR: ALTER action ADD CONSTRAINT cannot be performed on relation "ivm_t" > > DETAIL: This operation is not supported for materialized views. > > Good catch! It was my mistake creating unique constraints on IMMV in spite of > we cannot defined them via SQL. I'll fix it to use unique indexes instead of > constraints. > > Regards, > Yugo Nagata > > -- > Yugo NAGATA > > -- Yugo NAGATA IVM_patches_v24.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
Hello Takahashi-san, On Mon, 6 Sep 2021 10:06:37 + "r.takahash...@fujitsu.com" wrote: > Hi Nagata-san, > > > I'm still reading the patch. > I have additional comments. Thank you for your comments! > > (1) > In v23-0001-Add-a-syntax-to-create-Incrementally-Maintainabl.patch, ivm > member is added to IntoClause struct. > I think it is necessary to modify _copyIntoClause() and _equalIntoClause() > functions. Ok. I'll fix _copyIntoClause() and _equalIntoClause() as well as _readIntoClause() and _outIntoClause(). > (2) > By executing pg_dump with > v23-0005-Add-Incremental-View-Maintenance-support-to-pg_d.patch, > the constraint which is automatically created during "CREATE INCREMENTAL > MATERIALIZED VIEW" is also dumped. > This cause error during recovery as follows. > > ivm=# create table t (c1 int, c2 int); > CREATE TABLE > ivm=# create incremental materialized view ivm_t as select distinct c1 from t; > NOTICE: created index "ivm_t_index" on materialized view "ivm_t" > SELECT 0 > > Then I executed pg_dump. > > In the dump, the following SQLs appear. > > CREATE INCREMENTAL MATERIALIZED VIEW public.ivm_t AS > SELECT DISTINCT t.c1 >FROM public.t > WITH NO DATA; > > ALTER TABLE ONLY public.ivm_t > ADD CONSTRAINT ivm_t_index UNIQUE (c1); > > If I execute psql with the result of pg_dump, following error occurs. > > ERROR: ALTER action ADD CONSTRAINT cannot be performed on relation "ivm_t" > DETAIL: This operation is not supported for materialized views. Good catch! It was my mistake creating unique constraints on IMMV in spite of we cannot defined them via SQL. I'll fix it to use unique indexes instead of constraints. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hello Takahashi-san, On Thu, 5 Aug 2021 08:53:47 + "r.takahash...@fujitsu.com" wrote: > Hi Nagata-san, > > > Thank you for your reply. > > > I'll investigate this more, but we may have to prohibit views on partitioned > > table and partitions. > > I think this restriction is strict. > This feature is useful when the base table is large and partitioning is also > useful in such case. One reason of this issue is the lack of triggers on partitioned tables or partitions that are not specified in the view definition. However, even if we create triggers recursively on the parents or children, we would still need more consideration. This is because we will have to convert the format of tuple of modified table to the format of the table specified in the view for cases that the parent and some children have different format. I think supporting partitioned tables can be left for the next release. > > I have several additional comments on the patch. > > > (1) > The following features are added to transition table. > - Prolong lifespan of transition table > - If table has row security policies, set them to the transition table > - Calculate pre-state of the table > > Are these features only for IVM? > If there are other useful case, they should be separated from IVM patch and > should be independent patch for transition table. Maybe. However, we don't have good idea about use cases other than IVM of them for now... > > (2) > DEPENDENCY_IMMV (m) is added to deptype of pg_depend. > What is the difference compared with existing deptype such as > DEPENDENCY_INTERNAL (i)? DEPENDENCY_IMMV was added to clear that a certain trigger is related to IMMV. We dropped the IVM trigger and its dependencies from IMMV when REFRESH ... WITH NO DATA is executed. Without the new deptype, we may accidentally delete a dependency created with an intention other than the IVM trigger. > (3) > Converting from normal materialized view to IVM or from IVM to normal > materialized view is not implemented yet. > Is it difficult? > > I think create/drop triggers and __ivm_ columns can achieve this feature. I think it is harder than you expected. When an IMMV is switched to a normal materialized view, we needs to drop hidden columns (__ivm_count__ etc.), and in the opposite case, we need to create them again. The former (IMMV->IVM) might be easer, but for the latter (IVM->IMMV) I wonder we would need to re-create IMMV. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hello Zhihong Yu, Thank you for your suggestion! I am sorry for late replay. I'll fix them and submit the updated patch soon. On Sat, 7 Aug 2021 00:52:24 -0700 Zhihong Yu wrote: > > Hi, > > For v23-0007-Add-Incremental-View-Maintenance-support.patch : > > > > bq. In this implementation, AFTER triggers are used to collecting > > tuplestores > > > > 'to collecting' -> to collect > > > > bq. are contained in a old transition table. > > > > 'a old' -> an old > > > > bq. updates of more than one base tables > > > > one base tables -> one base table I'll fix them. > > bq. DISTINCT and tuple duplicates in views are supported > > > > Since distinct and duplicate have opposite meanings, it would be better to > > rephrase the above sentence. I'll rewrite it to "Incrementally Maintainable Materialized Views (IMMV) can contain duplicated tuples. Also, DISTINCT clause is supported. " > > bq. The value in__ivm_count__ is updated > > > > I searched the patch for in__ivm_count__ - there was no (second) match. I > > think there should be a space between in and underscore. Yes, the space was missing. > > +static void CreateIvmTriggersOnBaseTables_recurse(Query *qry, Node *node, > > Oid matviewOid, Relids *relids, bool ex_lock); > > > > nit: long line. please wrap. OK. > > > > + if (rewritten->distinctClause) > > + rewritten->groupClause = transformDistinctClause(NULL, > > >targetList, rewritten->sortClause, false); > > + > > + /* Add count(*) for counting distinct tuples in views */ > > + if (rewritten->distinctClause) > > > > It seems the body of the two if statements can be combined into one. Ok. > > + CreateIvmTriggersOnBaseTables_recurse(qry, (Node *)qry, matviewOid, > , ex_lock); > > Looking at existing recursive functions, e.g. > > src/backend/executor/execPartition.c:find_matching_subplans_recurse(PartitionPruningData > *prunedata, > > the letters in the function name are all lower case. I think following the > convention would be nice. Ok. I'll rename this to CreateIvmTriggersOnBaseTablesRecurse since I found DeadLockCheckRecurse, transformExprRecurse, and so on. > > + if (rte->rtekind == RTE_RELATION) > + { > + if (!bms_is_member(rte->relid, *relids)) > > The conditions for the two if statements can be combined (saving some > indentation). Yes. I'll fix. > + check_stack_depth(); > + > + if (node == NULL) > + return false; > > It seems the node check can be placed ahead of the stack depth check. OK. > + * CreateindexOnIMMV > > CreateindexOnIMMV -> CreateIndexOnIMMV > > + (errmsg("could not create an index on materialized view > \"%s\" automatically", > > It would be nice to mention the reason is the lack of primary key. > > + /* create no index, just notice that an appropriate index is > necessary for efficient, IVM */ > > for efficient -> for efficiency. I'll fix them. Thanks. Regards, Yugo Nagata -- Yugo NAGATA
RE: Implementing Incremental View Maintenance
Hi Nagata-san, I'm still reading the patch. I have additional comments. (1) In v23-0001-Add-a-syntax-to-create-Incrementally-Maintainabl.patch, ivm member is added to IntoClause struct. I think it is necessary to modify _copyIntoClause() and _equalIntoClause() functions. (2) By executing pg_dump with v23-0005-Add-Incremental-View-Maintenance-support-to-pg_d.patch, the constraint which is automatically created during "CREATE INCREMENTAL MATERIALIZED VIEW" is also dumped. This cause error during recovery as follows. ivm=# create table t (c1 int, c2 int); CREATE TABLE ivm=# create incremental materialized view ivm_t as select distinct c1 from t; NOTICE: created index "ivm_t_index" on materialized view "ivm_t" SELECT 0 Then I executed pg_dump. In the dump, the following SQLs appear. CREATE INCREMENTAL MATERIALIZED VIEW public.ivm_t AS SELECT DISTINCT t.c1 FROM public.t WITH NO DATA; ALTER TABLE ONLY public.ivm_t ADD CONSTRAINT ivm_t_index UNIQUE (c1); If I execute psql with the result of pg_dump, following error occurs. ERROR: ALTER action ADD CONSTRAINT cannot be performed on relation "ivm_t" DETAIL: This operation is not supported for materialized views. Regards, Ryohei Takahashi
Re: Implementing Incremental View Maintenance
On Sat, Aug 7, 2021 at 12:00 AM Zhihong Yu wrote: > > > On Sun, Aug 1, 2021 at 11:30 PM Yugo NAGATA wrote: > >> Hi hackers, >> >> On Mon, 19 Jul 2021 09:24:30 +0900 >> Yugo NAGATA wrote: >> >> > On Wed, 14 Jul 2021 21:22:37 +0530 >> > vignesh C wrote: >> >> > > The patch does not apply on Head anymore, could you rebase and post a >> > > patch. I'm changing the status to "Waiting for Author". >> > >> > Ok. I'll update the patch in a few days. >> >> Attached is the latest patch set to add support for Incremental >> Materialized View Maintenance (IVM) >> >> The patches are rebased to the master and also revised with some >> code cleaning. >> >> IVM is a way to make materialized views up-to-date in which only >> incremental changes are computed and applied on views rather than >> recomputing the contents from scratch as REFRESH MATERIALIZED VIEW >> does. IVM can update materialized views more efficiently >> than recomputation when only small part of the view need updates. >> >> The patch set implements a feature so that materialized views could be >> updated automatically and immediately when a base table is modified. >> >> Currently, our IVM implementation supports views which could contain >> tuple duplicates whose definition includes: >> >> - inner and outer joins including self-join >> - DISTINCT >> - some built-in aggregate functions (count, sum, agv, min, and max) >> - a part of subqueries >>-- simple subqueries in FROM clause >>-- EXISTS subqueries in WHERE clause >> - CTEs >> >> We hope the IVM feature would be adopted into pg15. However, the size of >> patch set has grown too large through supporting above features. >> Therefore, >> I think it is better to consider only a part of these features for the >> first >> release. Especially, I would like propose the following features for pg15. >> >> - inner joins including self-join >> - DISTINCT and views with tuple duplicates >> - some built-in aggregate functions (count, sum, agv, min, and max) >> >> By omitting outer-join, sub-queries, and CTE features, the patch size >> becomes >> less than half. I hope this will make a bit easer to review the IVM patch >> set. >> >> Here is a list of separated patches. >> >> - 0001: Add a new syntax: >> CREATE INCREMENTAL MATERIALIZED VIEW >> - 0002: Add a new column relisivm to pg_class >> - 0003: Add new deptype option 'm' in pg_depend >> - 0004: Change trigger.c to allow to prolong life span of tupestores >> containing Transition Tables generated via AFTER trigger >> - 0005: Add IVM supprot for pg_dump >> - 0006: Add IVM support for psql >> - 0007: Add the basic IVM future: >> This supports inner joins, DISTINCT, and tuple duplicates. >> - 0008: Add aggregates (count, sum, avg, min, max) support for IVM >> - 0009: Add regression tests for IVM >> - 0010: Add documentation for IVM >> >> We could split the patch furthermore if this would make reviews much >> easer. >> For example, I think 0007 could be split into the more basic part and the >> part >> for handling tuple duplicates. Moreover, 0008 could be split into >> "min/max" >> and other aggregates because handling min/max is a bit more complicated >> than >> others. >> >> I also attached IVM_extra.tar.gz that contains patches for sub-quereis, >> outer-join, CTE support, just for your information. >> >> Regards, >> Yugo Nagata >> >> -- >> Yugo NAGATA >> >> >> -- >> Yugo NAGATA >> > > Hi, > For v23-0007-Add-Incremental-View-Maintenance-support.patch : > > bq. In this implementation, AFTER triggers are used to collecting > tuplestores > > 'to collecting' -> to collect > > bq. are contained in a old transition table. > > 'a old' -> an old > > bq. updates of more than one base tables > > one base tables -> one base table > > bq. DISTINCT and tuple duplicates in views are supported > > Since distinct and duplicate have opposite meanings, it would be better to > rephrase the above sentence. > > bq. The value in__ivm_count__ is updated > > I searched the patch for in__ivm_count__ - there was no (second) match. I > think there should be a space between in and underscore. > > +static void CreateIvmTriggersOnBaseTables_recurse(Query *qry, Node *node, > Oid matviewOid, Relids *relids, bool ex_lock); > > nit: long line. please wrap. > > + if (rewritten->distinctClause) > + rewritten->groupClause = transformDistinctClause(NULL, > >targetList, rewritten->sortClause, false); > + > + /* Add count(*) for counting distinct tuples in views */ > + if (rewritten->distinctClause) > > It seems the body of the two if statements can be combined into one. > > More to follow for this patch. > > Cheers > Hi, + CreateIvmTriggersOnBaseTables_recurse(qry, (Node *)qry, matviewOid, , ex_lock); Looking at existing recursive functions, e.g. src/backend/executor/execPartition.c:find_matching_subplans_recurse(PartitionPruningData *prunedata, the letters in the function name are all lower case. I think following the convention would
Re: Implementing Incremental View Maintenance
On Sun, Aug 1, 2021 at 11:30 PM Yugo NAGATA wrote: > Hi hackers, > > On Mon, 19 Jul 2021 09:24:30 +0900 > Yugo NAGATA wrote: > > > On Wed, 14 Jul 2021 21:22:37 +0530 > > vignesh C wrote: > > > > The patch does not apply on Head anymore, could you rebase and post a > > > patch. I'm changing the status to "Waiting for Author". > > > > Ok. I'll update the patch in a few days. > > Attached is the latest patch set to add support for Incremental > Materialized View Maintenance (IVM) > > The patches are rebased to the master and also revised with some > code cleaning. > > IVM is a way to make materialized views up-to-date in which only > incremental changes are computed and applied on views rather than > recomputing the contents from scratch as REFRESH MATERIALIZED VIEW > does. IVM can update materialized views more efficiently > than recomputation when only small part of the view need updates. > > The patch set implements a feature so that materialized views could be > updated automatically and immediately when a base table is modified. > > Currently, our IVM implementation supports views which could contain > tuple duplicates whose definition includes: > > - inner and outer joins including self-join > - DISTINCT > - some built-in aggregate functions (count, sum, agv, min, and max) > - a part of subqueries >-- simple subqueries in FROM clause >-- EXISTS subqueries in WHERE clause > - CTEs > > We hope the IVM feature would be adopted into pg15. However, the size of > patch set has grown too large through supporting above features. > Therefore, > I think it is better to consider only a part of these features for the > first > release. Especially, I would like propose the following features for pg15. > > - inner joins including self-join > - DISTINCT and views with tuple duplicates > - some built-in aggregate functions (count, sum, agv, min, and max) > > By omitting outer-join, sub-queries, and CTE features, the patch size > becomes > less than half. I hope this will make a bit easer to review the IVM patch > set. > > Here is a list of separated patches. > > - 0001: Add a new syntax: > CREATE INCREMENTAL MATERIALIZED VIEW > - 0002: Add a new column relisivm to pg_class > - 0003: Add new deptype option 'm' in pg_depend > - 0004: Change trigger.c to allow to prolong life span of tupestores > containing Transition Tables generated via AFTER trigger > - 0005: Add IVM supprot for pg_dump > - 0006: Add IVM support for psql > - 0007: Add the basic IVM future: > This supports inner joins, DISTINCT, and tuple duplicates. > - 0008: Add aggregates (count, sum, avg, min, max) support for IVM > - 0009: Add regression tests for IVM > - 0010: Add documentation for IVM > > We could split the patch furthermore if this would make reviews much > easer. > For example, I think 0007 could be split into the more basic part and the > part > for handling tuple duplicates. Moreover, 0008 could be split into "min/max" > and other aggregates because handling min/max is a bit more complicated > than > others. > > I also attached IVM_extra.tar.gz that contains patches for sub-quereis, > outer-join, CTE support, just for your information. > > Regards, > Yugo Nagata > > -- > Yugo NAGATA > > > -- > Yugo NAGATA > Hi, For v23-0007-Add-Incremental-View-Maintenance-support.patch : bq. In this implementation, AFTER triggers are used to collecting tuplestores 'to collecting' -> to collect bq. are contained in a old transition table. 'a old' -> an old bq. updates of more than one base tables one base tables -> one base table bq. DISTINCT and tuple duplicates in views are supported Since distinct and duplicate have opposite meanings, it would be better to rephrase the above sentence. bq. The value in__ivm_count__ is updated I searched the patch for in__ivm_count__ - there was no (second) match. I think there should be a space between in and underscore. +static void CreateIvmTriggersOnBaseTables_recurse(Query *qry, Node *node, Oid matviewOid, Relids *relids, bool ex_lock); nit: long line. please wrap. + if (rewritten->distinctClause) + rewritten->groupClause = transformDistinctClause(NULL, >targetList, rewritten->sortClause, false); + + /* Add count(*) for counting distinct tuples in views */ + if (rewritten->distinctClause) It seems the body of the two if statements can be combined into one. More to follow for this patch. Cheers
RE: Implementing Incremental View Maintenance
Hi Nagata-san, Thank you for your reply. > I'll investigate this more, but we may have to prohibit views on partitioned > table and partitions. I think this restriction is strict. This feature is useful when the base table is large and partitioning is also useful in such case. I have several additional comments on the patch. (1) The following features are added to transition table. - Prolong lifespan of transition table - If table has row security policies, set them to the transition table - Calculate pre-state of the table Are these features only for IVM? If there are other useful case, they should be separated from IVM patch and should be independent patch for transition table. (2) DEPENDENCY_IMMV (m) is added to deptype of pg_depend. What is the difference compared with existing deptype such as DEPENDENCY_INTERNAL (i)? (3) Converting from normal materialized view to IVM or from IVM to normal materialized view is not implemented yet. Is it difficult? I think create/drop triggers and __ivm_ columns can achieve this feature. Regards, Ryohei Takahashi
Re: Implementing Incremental View Maintenance
Hello Takahashi-san, On Tue, 3 Aug 2021 10:15:42 + "r.takahash...@fujitsu.com" wrote: > Hi Nagata-san, > > > I am interested in this patch since it is good feature. > > I run some simple tests. > I found the following problems. Thank you for your interest for this patch! > (1) > Failed to "make world". > I think there are extra "" in > doc/src/sgml/ref/create_materialized_view.sgml > (line 110 and 117) Oops. I'll fix it. > (2) > In the case of partition, it seems that IVM does not work well. > I run as follows. > > postgres=# create table parent (c int) partition by range (c); > CREATE TABLE > postgres=# create table child partition of parent for values from (1) to > (100); > CREATE TABLE > postgres=# create incremental materialized view ivm_parent as select c from > parent; > NOTICE: could not create an index on materialized view "ivm_parent" > automatically > HINT: Create an index on the materialized view for efficient incremental > maintenance. > SELECT 0 > postgres=# create incremental materialized view ivm_child as select c from > child; > NOTICE: could not create an index on materialized view "ivm_child" > automatically > HINT: Create an index on the materialized view for efficient incremental > maintenance. > SELECT 0 > postgres=# insert into parent values (1); > INSERT 0 1 > postgres=# insert into child values (2); > INSERT 0 1 > postgres=# select * from parent; > c > --- > 1 > 2 > (2 rows) > > postgres=# select * from child; > c > --- > 1 > 2 > (2 rows) > > postgres=# select * from ivm_parent; > c > --- > 1 > (1 row) > > postgres=# select * from ivm_child; > c > --- > 2 > (1 row) > > > I think ivm_parent and ivm_child should return 2 rows. Good point! I'll investigate this more, but we may have to prohibit views on partitioned table and partitions. > (3) > I think IVM does not support foreign table, but try to make IVM. > > postgres=# create incremental materialized view ivm_foreign as select c from > foreign_table; > NOTICE: could not create an index on materialized view "ivm_foreign" > automatically > HINT: Create an index on the materialized view for efficient incremental > maintenance. > ERROR: "foreign_table" is a foreign table > DETAIL: Triggers on foreign tables cannot have transition tables. > > It finally failed to make IVM, but I think it should be checked more early. You are right. We don't support foreign tables as long as we use triggers. I'll fix. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hello Zhihong Yu, On Mon, 2 Aug 2021 14:33:46 -0700 Zhihong Yu wrote: > On Sun, Aug 1, 2021 at 11:30 PM Yugo NAGATA wrote: > > > Hi hackers, > > > > On Mon, 19 Jul 2021 09:24:30 +0900 > > Yugo NAGATA wrote: > > > > > On Wed, 14 Jul 2021 21:22:37 +0530 > > > vignesh C wrote: > > > > > > The patch does not apply on Head anymore, could you rebase and post a > > > > patch. I'm changing the status to "Waiting for Author". > > > > > > Ok. I'll update the patch in a few days. > > > > Attached is the latest patch set to add support for Incremental > > Materialized View Maintenance (IVM) > > > > The patches are rebased to the master and also revised with some > > code cleaning. > > > > IVM is a way to make materialized views up-to-date in which only > > incremental changes are computed and applied on views rather than > > recomputing the contents from scratch as REFRESH MATERIALIZED VIEW > > does. IVM can update materialized views more efficiently > > than recomputation when only small part of the view need updates. > > > > The patch set implements a feature so that materialized views could be > > updated automatically and immediately when a base table is modified. > > > > Currently, our IVM implementation supports views which could contain > > tuple duplicates whose definition includes: > > > > - inner and outer joins including self-join > > - DISTINCT > > - some built-in aggregate functions (count, sum, agv, min, and max) > > - a part of subqueries > >-- simple subqueries in FROM clause > >-- EXISTS subqueries in WHERE clause > > - CTEs > > > > We hope the IVM feature would be adopted into pg15. However, the size of > > patch set has grown too large through supporting above features. > > Therefore, > > I think it is better to consider only a part of these features for the > > first > > release. Especially, I would like propose the following features for pg15. > > > > - inner joins including self-join > > - DISTINCT and views with tuple duplicates > > - some built-in aggregate functions (count, sum, agv, min, and max) > > > > By omitting outer-join, sub-queries, and CTE features, the patch size > > becomes > > less than half. I hope this will make a bit easer to review the IVM patch > > set. > > > > Here is a list of separated patches. > > > > - 0001: Add a new syntax: > > CREATE INCREMENTAL MATERIALIZED VIEW > > - 0002: Add a new column relisivm to pg_class > > - 0003: Add new deptype option 'm' in pg_depend > > - 0004: Change trigger.c to allow to prolong life span of tupestores > > containing Transition Tables generated via AFTER trigger > > - 0005: Add IVM supprot for pg_dump > > - 0006: Add IVM support for psql > > - 0007: Add the basic IVM future: > > This supports inner joins, DISTINCT, and tuple duplicates. > > - 0008: Add aggregates (count, sum, avg, min, max) support for IVM > > - 0009: Add regression tests for IVM > > - 0010: Add documentation for IVM > > > > We could split the patch furthermore if this would make reviews much > > easer. > > For example, I think 0007 could be split into the more basic part and the > > part > > for handling tuple duplicates. Moreover, 0008 could be split into "min/max" > > and other aggregates because handling min/max is a bit more complicated > > than > > others. > > > > I also attached IVM_extra.tar.gz that contains patches for sub-quereis, > > outer-join, CTE support, just for your information. > > > > Regards, > > Yugo Nagata > > > > -- > > Yugo NAGATA > > > > > > -- > > Yugo NAGATA > > > Hi, > For v23-0008-Add-aggregates-support-in-IVM.patch : Thank you for looking into this! > As a restriction, expressions specified in GROUP BY must appear in > the target list because tuples to be updated in IMMV are identified > by using this group keys. > > IMMV -> IMVM (Incremental Materialized View Maintenance, as said above) > Or maybe it means 'incrementally maintainable materialized view'. It would > be better to use the same abbreviation. IMMV is correct in the commit message of this patch. Rather, IMVM used in v23-0003-Add-new-deptype-option-m-in-pg_depend-system-cat.patch should be corrected to IMMV. > this group keys -> this group key > > +errmsg("GROUP BY expression not appeared in select > list is not supported on incrementally maintainable materialized view"))); > > expression not appeared in select list -> expression not appearing in > select list > > +* For aggregate functions except to count > > except to count -> except count Thank you for pointing out them. I'll fix. Regards, Yugo Nagata -- Yugo NAGATA
RE: Implementing Incremental View Maintenance
Hi Nagata-san, I am interested in this patch since it is good feature. I run some simple tests. I found the following problems. (1) Failed to "make world". I think there are extra "" in doc/src/sgml/ref/create_materialized_view.sgml (line 110 and 117) (2) In the case of partition, it seems that IVM does not work well. I run as follows. postgres=# create table parent (c int) partition by range (c); CREATE TABLE postgres=# create table child partition of parent for values from (1) to (100); CREATE TABLE postgres=# create incremental materialized view ivm_parent as select c from parent; NOTICE: could not create an index on materialized view "ivm_parent" automatically HINT: Create an index on the materialized view for efficient incremental maintenance. SELECT 0 postgres=# create incremental materialized view ivm_child as select c from child; NOTICE: could not create an index on materialized view "ivm_child" automatically HINT: Create an index on the materialized view for efficient incremental maintenance. SELECT 0 postgres=# insert into parent values (1); INSERT 0 1 postgres=# insert into child values (2); INSERT 0 1 postgres=# select * from parent; c --- 1 2 (2 rows) postgres=# select * from child; c --- 1 2 (2 rows) postgres=# select * from ivm_parent; c --- 1 (1 row) postgres=# select * from ivm_child; c --- 2 (1 row) I think ivm_parent and ivm_child should return 2 rows. (3) I think IVM does not support foreign table, but try to make IVM. postgres=# create incremental materialized view ivm_foreign as select c from foreign_table; NOTICE: could not create an index on materialized view "ivm_foreign" automatically HINT: Create an index on the materialized view for efficient incremental maintenance. ERROR: "foreign_table" is a foreign table DETAIL: Triggers on foreign tables cannot have transition tables. It finally failed to make IVM, but I think it should be checked more early. Regards, Ryohei Takahashi
Re: Implementing Incremental View Maintenance
On Sun, Aug 1, 2021 at 11:30 PM Yugo NAGATA wrote: > Hi hackers, > > On Mon, 19 Jul 2021 09:24:30 +0900 > Yugo NAGATA wrote: > > > On Wed, 14 Jul 2021 21:22:37 +0530 > > vignesh C wrote: > > > > The patch does not apply on Head anymore, could you rebase and post a > > > patch. I'm changing the status to "Waiting for Author". > > > > Ok. I'll update the patch in a few days. > > Attached is the latest patch set to add support for Incremental > Materialized View Maintenance (IVM) > > The patches are rebased to the master and also revised with some > code cleaning. > > IVM is a way to make materialized views up-to-date in which only > incremental changes are computed and applied on views rather than > recomputing the contents from scratch as REFRESH MATERIALIZED VIEW > does. IVM can update materialized views more efficiently > than recomputation when only small part of the view need updates. > > The patch set implements a feature so that materialized views could be > updated automatically and immediately when a base table is modified. > > Currently, our IVM implementation supports views which could contain > tuple duplicates whose definition includes: > > - inner and outer joins including self-join > - DISTINCT > - some built-in aggregate functions (count, sum, agv, min, and max) > - a part of subqueries >-- simple subqueries in FROM clause >-- EXISTS subqueries in WHERE clause > - CTEs > > We hope the IVM feature would be adopted into pg15. However, the size of > patch set has grown too large through supporting above features. > Therefore, > I think it is better to consider only a part of these features for the > first > release. Especially, I would like propose the following features for pg15. > > - inner joins including self-join > - DISTINCT and views with tuple duplicates > - some built-in aggregate functions (count, sum, agv, min, and max) > > By omitting outer-join, sub-queries, and CTE features, the patch size > becomes > less than half. I hope this will make a bit easer to review the IVM patch > set. > > Here is a list of separated patches. > > - 0001: Add a new syntax: > CREATE INCREMENTAL MATERIALIZED VIEW > - 0002: Add a new column relisivm to pg_class > - 0003: Add new deptype option 'm' in pg_depend > - 0004: Change trigger.c to allow to prolong life span of tupestores > containing Transition Tables generated via AFTER trigger > - 0005: Add IVM supprot for pg_dump > - 0006: Add IVM support for psql > - 0007: Add the basic IVM future: > This supports inner joins, DISTINCT, and tuple duplicates. > - 0008: Add aggregates (count, sum, avg, min, max) support for IVM > - 0009: Add regression tests for IVM > - 0010: Add documentation for IVM > > We could split the patch furthermore if this would make reviews much > easer. > For example, I think 0007 could be split into the more basic part and the > part > for handling tuple duplicates. Moreover, 0008 could be split into "min/max" > and other aggregates because handling min/max is a bit more complicated > than > others. > > I also attached IVM_extra.tar.gz that contains patches for sub-quereis, > outer-join, CTE support, just for your information. > > Regards, > Yugo Nagata > > -- > Yugo NAGATA > > > -- > Yugo NAGATA > Hi, For v23-0008-Add-aggregates-support-in-IVM.patch : As a restriction, expressions specified in GROUP BY must appear in the target list because tuples to be updated in IMMV are identified by using this group keys. IMMV -> IMVM (Incremental Materialized View Maintenance, as said above) Or maybe it means 'incrementally maintainable materialized view'. It would be better to use the same abbreviation. this group keys -> this group key +errmsg("GROUP BY expression not appeared in select list is not supported on incrementally maintainable materialized view"))); expression not appeared in select list -> expression not appearing in select list +* For aggregate functions except to count except to count -> except count Cheers
Re: Implementing Incremental View Maintenance
On Wed, 14 Jul 2021 21:22:37 +0530 vignesh C wrote: > On Mon, May 17, 2021 at 10:08 AM Yugo NAGATA wrote: > > > > On Fri, 7 May 2021 14:14:16 +0900 > > Yugo NAGATA wrote: > > > > > On Mon, 26 Apr 2021 16:03:48 +0900 > > > Yugo NAGATA wrote: > > > > > > > On Mon, 26 Apr 2021 15:46:21 +0900 > > > > Yugo NAGATA wrote: > > > > > > > > > On Tue, 20 Apr 2021 09:51:34 +0900 > > > > > Yugo NAGATA wrote: > > > > > > > > > > > On Mon, 19 Apr 2021 17:40:31 -0400 > > > > > > Tom Lane wrote: > > > > > > > > > > > > > Andrew Dunstan writes: > > > > > > > > This patch (v22c) just crashed for me with an assertion failure > > > > > > > > on > > > > > > > > Fedora 31. Here's the stack trace: > > > > > > > > > > > > > > > #2 0x0094a54a in ExceptionalCondition > > > > > > > > (conditionName=conditionName@entry=0xa91dae > > > > > > > > "queryDesc->sourceText != > > > > > > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > > > > > > > fileName=fileName@entry=0xa91468 > > > > > > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > > > > > > > lineNumber=lineNumber@entry=199) at > > > > > > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > > > > > > > > > > > > > That assert just got added a few days ago, so that's why the patch > > > > > > > seemed OK before. > > > > > > > > > > > > Thank you for letting me know. I'll fix it. > > > > > > > > > > Attached is the fixed patch. > > > > > > > > > > queryDesc->sourceText cannot be NULL after commit b2668d8, > > > > > so now we pass an empty string "" for refresh_matview_datafill() > > > > > instead NULL > > > > > when maintaining views incrementally. > > > > > > > > I am sorry, I forgot to include a fix for 8aba9322511. > > > > Attached is the fixed version. > > > > > > Attached is the rebased patch (for 6b8d29419d). > > > > I attached a rebased patch. > > The patch does not apply on Head anymore, could you rebase and post a > patch. I'm changing the status to "Waiting for Author". Ok. I'll update the patch in a few days. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On Mon, May 17, 2021 at 10:08 AM Yugo NAGATA wrote: > > On Fri, 7 May 2021 14:14:16 +0900 > Yugo NAGATA wrote: > > > On Mon, 26 Apr 2021 16:03:48 +0900 > > Yugo NAGATA wrote: > > > > > On Mon, 26 Apr 2021 15:46:21 +0900 > > > Yugo NAGATA wrote: > > > > > > > On Tue, 20 Apr 2021 09:51:34 +0900 > > > > Yugo NAGATA wrote: > > > > > > > > > On Mon, 19 Apr 2021 17:40:31 -0400 > > > > > Tom Lane wrote: > > > > > > > > > > > Andrew Dunstan writes: > > > > > > > This patch (v22c) just crashed for me with an assertion failure on > > > > > > > Fedora 31. Here's the stack trace: > > > > > > > > > > > > > #2 0x0094a54a in ExceptionalCondition > > > > > > > (conditionName=conditionName@entry=0xa91dae > > > > > > > "queryDesc->sourceText != > > > > > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > > > > > > fileName=fileName@entry=0xa91468 > > > > > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > > > > > > lineNumber=lineNumber@entry=199) at > > > > > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > > > > > > > > > > > That assert just got added a few days ago, so that's why the patch > > > > > > seemed OK before. > > > > > > > > > > Thank you for letting me know. I'll fix it. > > > > > > > > Attached is the fixed patch. > > > > > > > > queryDesc->sourceText cannot be NULL after commit b2668d8, > > > > so now we pass an empty string "" for refresh_matview_datafill() > > > > instead NULL > > > > when maintaining views incrementally. > > > > > > I am sorry, I forgot to include a fix for 8aba9322511. > > > Attached is the fixed version. > > > > Attached is the rebased patch (for 6b8d29419d). > > I attached a rebased patch. The patch does not apply on Head anymore, could you rebase and post a patch. I'm changing the status to "Waiting for Author". Regards, Vignesh
Re: Implementing Incremental View Maintenance
On Fri, 7 May 2021 14:14:16 +0900 Yugo NAGATA wrote: > On Mon, 26 Apr 2021 16:03:48 +0900 > Yugo NAGATA wrote: > > > On Mon, 26 Apr 2021 15:46:21 +0900 > > Yugo NAGATA wrote: > > > > > On Tue, 20 Apr 2021 09:51:34 +0900 > > > Yugo NAGATA wrote: > > > > > > > On Mon, 19 Apr 2021 17:40:31 -0400 > > > > Tom Lane wrote: > > > > > > > > > Andrew Dunstan writes: > > > > > > This patch (v22c) just crashed for me with an assertion failure on > > > > > > Fedora 31. Here's the stack trace: > > > > > > > > > > > #2 0x0094a54a in ExceptionalCondition > > > > > > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText > > > > > > != > > > > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > > > > > fileName=fileName@entry=0xa91468 > > > > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > > > > > lineNumber=lineNumber@entry=199) at > > > > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > > > > > > > > > That assert just got added a few days ago, so that's why the patch > > > > > seemed OK before. > > > > > > > > Thank you for letting me know. I'll fix it. > > > > > > Attached is the fixed patch. > > > > > > queryDesc->sourceText cannot be NULL after commit b2668d8, > > > so now we pass an empty string "" for refresh_matview_datafill() instead > > > NULL > > > when maintaining views incrementally. > > > > I am sorry, I forgot to include a fix for 8aba9322511. > > Attached is the fixed version. > > Attached is the rebased patch (for 6b8d29419d). I attached a rebased patch. -- Yugo NAGATA IVM_patches_v22g.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Mon, 26 Apr 2021 16:03:48 +0900 Yugo NAGATA wrote: > On Mon, 26 Apr 2021 15:46:21 +0900 > Yugo NAGATA wrote: > > > On Tue, 20 Apr 2021 09:51:34 +0900 > > Yugo NAGATA wrote: > > > > > On Mon, 19 Apr 2021 17:40:31 -0400 > > > Tom Lane wrote: > > > > > > > Andrew Dunstan writes: > > > > > This patch (v22c) just crashed for me with an assertion failure on > > > > > Fedora 31. Here's the stack trace: > > > > > > > > > #2 0x0094a54a in ExceptionalCondition > > > > > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText != > > > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > > > > fileName=fileName@entry=0xa91468 > > > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > > > > lineNumber=lineNumber@entry=199) at > > > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > > > > > > > That assert just got added a few days ago, so that's why the patch > > > > seemed OK before. > > > > > > Thank you for letting me know. I'll fix it. > > > > Attached is the fixed patch. > > > > queryDesc->sourceText cannot be NULL after commit b2668d8, > > so now we pass an empty string "" for refresh_matview_datafill() instead > > NULL > > when maintaining views incrementally. > > I am sorry, I forgot to include a fix for 8aba9322511. > Attached is the fixed version. Attached is the rebased patch (for 6b8d29419d). -- Yugo NAGATA IVM_patches_v22f.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Mon, 26 Apr 2021 15:46:21 +0900 Yugo NAGATA wrote: > On Tue, 20 Apr 2021 09:51:34 +0900 > Yugo NAGATA wrote: > > > On Mon, 19 Apr 2021 17:40:31 -0400 > > Tom Lane wrote: > > > > > Andrew Dunstan writes: > > > > This patch (v22c) just crashed for me with an assertion failure on > > > > Fedora 31. Here's the stack trace: > > > > > > > #2 0x0094a54a in ExceptionalCondition > > > > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText != > > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > > > fileName=fileName@entry=0xa91468 > > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > > > lineNumber=lineNumber@entry=199) at > > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > > > > > That assert just got added a few days ago, so that's why the patch > > > seemed OK before. > > > > Thank you for letting me know. I'll fix it. > > Attached is the fixed patch. > > queryDesc->sourceText cannot be NULL after commit b2668d8, > so now we pass an empty string "" for refresh_matview_datafill() instead NULL > when maintaining views incrementally. I am sorry, I forgot to include a fix for 8aba9322511. Attached is the fixed version. Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v22e.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Tue, 20 Apr 2021 09:51:34 +0900 Yugo NAGATA wrote: > On Mon, 19 Apr 2021 17:40:31 -0400 > Tom Lane wrote: > > > Andrew Dunstan writes: > > > This patch (v22c) just crashed for me with an assertion failure on > > > Fedora 31. Here's the stack trace: > > > > > #2 0x0094a54a in ExceptionalCondition > > > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText != > > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > > fileName=fileName@entry=0xa91468 > > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > > lineNumber=lineNumber@entry=199) at > > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > > > That assert just got added a few days ago, so that's why the patch > > seemed OK before. > > Thank you for letting me know. I'll fix it. Attached is the fixed patch. queryDesc->sourceText cannot be NULL after commit b2668d8, so now we pass an empty string "" for refresh_matview_datafill() instead NULL when maintaining views incrementally. Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v22d.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Mon, 19 Apr 2021 17:40:31 -0400 Tom Lane wrote: > Andrew Dunstan writes: > > This patch (v22c) just crashed for me with an assertion failure on > > Fedora 31. Here's the stack trace: > > > #2 0x0094a54a in ExceptionalCondition > > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText != > > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > > fileName=fileName@entry=0xa91468 > > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > > lineNumber=lineNumber@entry=199) at > > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 > > That assert just got added a few days ago, so that's why the patch > seemed OK before. Thank you for letting me know. I'll fix it. -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Andrew Dunstan writes: > This patch (v22c) just crashed for me with an assertion failure on > Fedora 31. Here's the stack trace: > #2 0x0094a54a in ExceptionalCondition > (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText != > NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", > fileName=fileName@entry=0xa91468 > "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", > lineNumber=lineNumber@entry=199) at > /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 That assert just got added a few days ago, so that's why the patch seemed OK before. regards, tom lane
Re: Implementing Incremental View Maintenance
On 4/7/21 5:25 AM, Yugo NAGATA wrote: > Hi, > > I rebased the patch because the cfbot failed. > > Regards, > Yugo Nagata This patch (v22c) just crashed for me with an assertion failure on Fedora 31. Here's the stack trace: [New LWP 333090] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `postgres: andrew regression [local] INSERT '. Program terminated with signal SIGABRT, Aborted. #0 0x7f8981caa625 in raise () from /lib64/libc.so.6 #0 0x7f8981caa625 in raise () from /lib64/libc.so.6 #1 0x7f8981c938d9 in abort () from /lib64/libc.so.6 #2 0x0094a54a in ExceptionalCondition (conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText != NULL", errorType=errorType@entry=0x99b468 "FailedAssertion", fileName=fileName@entry=0xa91468 "/home/andrew/pgl/pg_head/src/backend/executor/execMain.c", lineNumber=lineNumber@entry=199) at /home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69 #3 0x006c0e17 in standard_ExecutorStart (queryDesc=0x226af98, eflags=0) at /home/andrew/pgl/pg_head/src/backend/executor/execMain.c:199 #4 0x006737b2 in refresh_matview_datafill (dest=0x21cf428, query=, queryEnv=0x2245fd0, resultTupleDesc=0x7ffd5e764888, queryString=0x0) at /home/andrew/pgl/pg_head/src/backend/commands/matview.c:719 #5 0x00678042 in calc_delta (queryEnv=0x2245fd0, tupdesc_new=0x7ffd5e764888, tupdesc_old=0x7ffd5e764880, dest_new=0x21cf428, dest_old=0x0, query=0x2246108, rte_path=0x2228a60, table=) at /home/andrew/pgl/pg_head/src/backend/commands/matview.c:2907 #6 IVM_immediate_maintenance (fcinfo=) at /home/andrew/pgl/pg_head/src/backend/commands/matview.c:1683 #7 0x0069e483 in ExecCallTriggerFunc (trigdata=0x7ffd5e764bb0, tgindx=2, finfo=0x22345f8, instr=0x0, per_tuple_context=0x2245eb0) at /home/andrew/pgl/pg_head/src/backend/commands/trigger.c:2142 #8 0x0069fc4c in AfterTriggerExecute (trigdesc=0x2233db8, trigdesc=0x2233db8, trig_tuple_slot2=0x0, trig_tuple_slot1=0x0, per_tuple_context=0x2245eb0, instr=0x0, finfo=0x2234598, relInfo=0x2233ba0, event=0x222d380, estate=0x2233710) at /home/andrew/pgl/pg_head/src/backend/commands/trigger.c:4041 #9 afterTriggerInvokeEvents (events=0x21cece8, firing_id=1, estate=0x2233710, delete_ok=false) at /home/andrew/pgl/pg_head/src/backend/commands/trigger.c:4255 #10 0x006a4173 in AfterTriggerEndQuery (estate=estate@entry=0x2233710) at /home/andrew/pgl/pg_head/src/backend/commands/trigger.c:4632 #11 0x006c04c8 in standard_ExecutorFinish (queryDesc=0x2237300) at /home/andrew/pgl/pg_head/src/backend/executor/execMain.c:436 #12 0x008415d8 in ProcessQuery (plan=, sourceText=0x21490a0 "INSERT INTO mv_base_b VALUES(5,105);", params=0x0, queryEnv=0x0, dest=0x2221010, qc=0x7ffd5e764f00) at /home/andrew/pgl/pg_head/src/backend/tcop/pquery.c:190 #13 0x008417f2 in PortalRunMulti (portal=portal@entry=0x21ac3c0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x2221010, altdest=altdest@entry=0x2221010, qc=qc@entry=0x7ffd5e764f00) at /home/andrew/pgl/pg_head/src/backend/tcop/pquery.c:1267 #14 0x00842415 in PortalRun (portal=portal@entry=0x21ac3c0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x2221010, altdest=altdest@entry=0x2221010, qc=0x7ffd5e764f00) at /home/andrew/pgl/pg_head/src/backend/tcop/pquery.c:779 #15 0x0083e3ca in exec_simple_query (query_string=0x21490a0 "INSERT INTO mv_base_b VALUES(5,105);") at /home/andrew/pgl/pg_head/src/backend/tcop/postgres.c:1196 #16 0x00840075 in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7ffd5e765450, dbname=, username=) at /home/andrew/pgl/pg_head/src/backend/tcop/postgres.c:4458 #17 0x007b8054 in BackendRun (port=, port=) at /home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:4488 #18 BackendStartup (port=) at /home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:4210 #19 ServerLoop () at /home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:1742 #20 0x007b8ebf in PostmasterMain (argc=argc@entry=8, argv=argv@entry=0x21435c0) at /home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:1414 #21 0x0050e030 in main (argc=8, argv=0x21435c0) at /home/andrew/pgl/pg_head/src/backend/main/main.c:209 $1 = {si_signo = 6, si_errno = 0, si_code = -6, _sifields = {_pad = {333090, 500, 0 }, _kill = {si_pid = 333090, si_uid = 500}, _timer = {si_tid = 333090, si_overrun = 500, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _rt = {si_pid = 333090, si_uid = 500, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _sigchld = {si_pid = 333090, si_uid = 500, si_status = 0, si_utime = 0, si_stime = 0}, _sigfault = {si_addr = 0x1f400051522, _addr_lsb = 0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {si_band = 2147483981090, si_fd = 0}}}
Re: Implementing Incremental View Maintenance
Hi, I rebased the patch because the cfbot failed. Regards, Yugo Nagata On Tue, 9 Mar 2021 17:27:50 +0900 Yugo NAGATA wrote: > On Tue, 9 Mar 2021 09:20:49 +0900 > Yugo NAGATA wrote: > > > On Mon, 8 Mar 2021 15:42:00 -0500 > > Andrew Dunstan wrote: > > > > > > > > On 2/18/21 9:01 PM, Yugo NAGATA wrote: > > > > On Thu, 18 Feb 2021 19:38:44 +0800 > > > > Andy Fan wrote: > > > > > > > >> On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA > > > >> wrote: > > > >> > > > >>> Hi, > > > >>> > > > >>> Attached is a rebased patch (v22a). > > > >>> > > > >> Thanks for the patch. Will you think posting a patch with the latest > > > >> commit > > > >> at that > > > >> time is helpful? If so, when others want to review it, they know which > > > >> commit to > > > >> apply the patch without asking for a new rebase usually. > > > > I rebased the patch because cfbot failed. > > > > http://cfbot.cputube.org/ > > > > > > > > > > It's bitrotted a bit more dues to commits bb437f995d and 25936fd46c > > > > Thank you for letting me konw. I'll rebase it soon. > > Done. Attached is a rebased patch set. > > Regards, > Yugo Nagata > > -- > Yugo NAGATA -- Yugo NAGATA IVM_patches_v22c.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Tue, 9 Mar 2021 09:20:49 +0900 Yugo NAGATA wrote: > On Mon, 8 Mar 2021 15:42:00 -0500 > Andrew Dunstan wrote: > > > > > On 2/18/21 9:01 PM, Yugo NAGATA wrote: > > > On Thu, 18 Feb 2021 19:38:44 +0800 > > > Andy Fan wrote: > > > > > >> On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA wrote: > > >> > > >>> Hi, > > >>> > > >>> Attached is a rebased patch (v22a). > > >>> > > >> Thanks for the patch. Will you think posting a patch with the latest > > >> commit > > >> at that > > >> time is helpful? If so, when others want to review it, they know which > > >> commit to > > >> apply the patch without asking for a new rebase usually. > > > I rebased the patch because cfbot failed. > > > http://cfbot.cputube.org/ > > > > > > > It's bitrotted a bit more dues to commits bb437f995d and 25936fd46c > > Thank you for letting me konw. I'll rebase it soon. Done. Attached is a rebased patch set. Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v22b.tar.gz Description: application/gzip
RE: Implementing Incremental View Maintenance
From: Thomas Munro > It's probably time to move forward with the plan of pushing the > results into a commitfest.postgresql.org API, and then making Magnus > et al write the email spam code with a preferences screen linked to > your community account :-D +1 I wish to see all the patch status information on the CF app. Regards Takayuki Tsunakawa
Re: Implementing Incremental View Maintenance
On Tue, Mar 9, 2021 at 1:22 PM Yugo NAGATA wrote: > On Mon, 8 Mar 2021 15:42:00 -0500 > Andrew Dunstan wrote: > > (A useful feature of the cfbot might be to notify the authors and > > reviewers when it detects bitrot for a previously passing entry.) > > +1 > The feature notifying it authors seems to me nice. Nice idea. I was initially afraid of teaching cfbot to send email, for fear of creating an out of control spam machine. Probably the main thing would be the ability to interact with it to turn it on/off. It's probably time to move forward with the plan of pushing the results into a commitfest.postgresql.org API, and then making Magnus et al write the email spam code with a preferences screen linked to your community account :-D
Re: Implementing Incremental View Maintenance
On Mon, 8 Mar 2021 15:42:00 -0500 Andrew Dunstan wrote: > > On 2/18/21 9:01 PM, Yugo NAGATA wrote: > > On Thu, 18 Feb 2021 19:38:44 +0800 > > Andy Fan wrote: > > > >> On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA wrote: > >> > >>> Hi, > >>> > >>> Attached is a rebased patch (v22a). > >>> > >> Thanks for the patch. Will you think posting a patch with the latest commit > >> at that > >> time is helpful? If so, when others want to review it, they know which > >> commit to > >> apply the patch without asking for a new rebase usually. > > I rebased the patch because cfbot failed. > > http://cfbot.cputube.org/ > > > > It's bitrotted a bit more dues to commits bb437f995d and 25936fd46c Thank you for letting me konw. I'll rebase it soon. > > > (A useful feature of the cfbot might be to notify the authors and > reviewers when it detects bitrot for a previously passing entry.) +1 The feature notifying it authors seems to me nice. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On 2/18/21 9:01 PM, Yugo NAGATA wrote: > On Thu, 18 Feb 2021 19:38:44 +0800 > Andy Fan wrote: > >> On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA wrote: >> >>> Hi, >>> >>> Attached is a rebased patch (v22a). >>> >> Thanks for the patch. Will you think posting a patch with the latest commit >> at that >> time is helpful? If so, when others want to review it, they know which >> commit to >> apply the patch without asking for a new rebase usually. > I rebased the patch because cfbot failed. > http://cfbot.cputube.org/ > It's bitrotted a bit more dues to commits bb437f995d and 25936fd46c (A useful feature of the cfbot might be to notify the authors and reviewers when it detects bitrot for a previously passing entry.) cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Implementing Incremental View Maintenance
On Thu, 18 Feb 2021 19:38:44 +0800 Andy Fan wrote: > On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA wrote: > > > Hi, > > > > Attached is a rebased patch (v22a). > > > > Thanks for the patch. Will you think posting a patch with the latest commit > at that > time is helpful? If so, when others want to review it, they know which > commit to > apply the patch without asking for a new rebase usually. I rebased the patch because cfbot failed. http://cfbot.cputube.org/ Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA wrote: > Hi, > > Attached is a rebased patch (v22a). > Thanks for the patch. Will you think posting a patch with the latest commit at that time is helpful? If so, when others want to review it, they know which commit to apply the patch without asking for a new rebase usually. -- Best Regards Andy Fan (https://www.aliyun.com/)
Re: Implementing Incremental View Maintenance
Hi, Attached is a rebased patch (v22a). Ragards, Yugo Nagata -- Yugo NAGATA IVM_patches_v22a.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
Hi, Attached is a revised patch (v22) rebased for the latest master head. Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v22.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
Hi, Attached is the revised patch (v21) to add support for Incremental Materialized View Maintenance (IVM). In addition to some typos in the previous enhancement, I fixed a check to prevent a view from containing an expression including aggregates like sum(x)/sum(y) in this revision. Regards, Yugo Nagata On Tue, 22 Dec 2020 22:24:22 +0900 Yugo NAGATA wrote: > Hi hackers, > > I heard the opinion that this patch is too big and hard to review. > So, I wander that we should downsize the patch by eliminating some > features and leaving other basic features. > > If there are more opinions this makes it easer for reviewers to look > at this patch, I would like do it. If so, we plan to support only > selection, projection, inner-join, and some aggregates in the first > release and leave sub-queries, outer-join, and CTE supports to the > next release. > > Regards, > Yugo Nagata > > On Tue, 22 Dec 2020 21:51:36 +0900 > Yugo NAGATA wrote: > > Hi, > > > > Attached is the revised patch (v20) to add support for Incremental > > Materialized View Maintenance (IVM). > > > > In according with Konstantin's suggestion, I made a few optimizations. > > > > 1. Creating an index on the matview automatically > > > > When creating incremental maintainable materialized view (IMMV)s, > > a unique index on IMMV is created automatically if possible. > > > > If the view definition query has a GROUP BY clause, the index is created > > on the columns of GROUP BY expressions. Otherwise, if the view contains > > all primary key attributes of its base tables in the target list, the index > > is created on these attributes. Also, if the view has DISTINCT, > > a unique index is created on all columns in the target list. > > In other cases, no index is created. > > > > In all cases, a NOTICE message is output to inform users that an index is > > created or that an appropriate index is necessary for efficient IVM. > > > > 2. Use a weaker lock on the matview if possible > > > > If the view has only one base table in this query, RowExclusiveLock is > > held on the view instead of AccessExclusiveLock, because we don't > > need to wait other concurrent transaction's result in order to > > maintain the view in this case. When the same row in the view is > > affected due to concurrent maintenances, a row level lock will > > protect it. > > > > On Tue, 24 Nov 2020 12:46:57 +0300 > > Konstantin Knizhnik wrote: > > > > > The most obvious optimization is not to use exclusive table lock if view > > > depends just on one table (contains no joins). > > > Looks like there are no any anomalies in this case, are there? > > > > I confirmed the effect of this optimizations. > > > > First, when I performed pgbench (SF=100) without any materialized views, > > the results is : > > > > pgbench test4 -T 300 -c 8 -j 4 > > latency average = 6.493 ms > > tps = 1232.146229 (including connections establishing) > > > > Next, created a view as below, I performed the same pgbench. > > CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm2 AS > > SELECT bid, count(abalance), sum(abalance), avg(abalance) > > FROM pgbench_accounts GROUP BY bid; > > > > The result is here: > > > > [the previous version (v19 with exclusive table lock)] > > - latency average = 77.677 ms > > - tps = 102.990159 (including connections establishing) > > > > [In the latest version (v20 with weaker lock)] > > - latency average = 17.576 ms > > - tps = 455.159644 (including connections establishing) > > > > There is still substantial overhead, but we can see that the effect > > of the optimization. > > > > Regards, > > Yugo Nagata > > > > -- > > Yugo NAGATA > > > -- > Yugo NAGATA > > -- Yugo NAGATA IVM_patches_v21.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
Hi Yugo, > 1. Creating an index on the matview automatically Nice. > 2. Use a weaker lock on the matview if possible > > If the view has only one base table in this query, RowExclusiveLock is > held on the view instead of AccessExclusiveLock, because we don't > need to wait other concurrent transaction's result in order to > maintain the view in this case. When the same row in the view is > affected due to concurrent maintenances, a row level lock will > protect it. > > On Tue, 24 Nov 2020 12:46:57 +0300 > Konstantin Knizhnik wrote: > >> The most obvious optimization is not to use exclusive table lock if view >> depends just on one table (contains no joins). >> Looks like there are no any anomalies in this case, are there? > > I confirmed the effect of this optimizations. > > First, when I performed pgbench (SF=100) without any materialized views, > the results is : > > pgbench test4 -T 300 -c 8 -j 4 > latency average = 6.493 ms > tps = 1232.146229 (including connections establishing) > > Next, created a view as below, I performed the same pgbench. > CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm2 AS > SELECT bid, count(abalance), sum(abalance), avg(abalance) > FROM pgbench_accounts GROUP BY bid; > > The result is here: > > [the previous version (v19 with exclusive table lock)] > - latency average = 77.677 ms > - tps = 102.990159 (including connections establishing) > > [In the latest version (v20 with weaker lock)] > - latency average = 17.576 ms > - tps = 455.159644 (including connections establishing) > > There is still substantial overhead, but we can see that the effect > of the optimization. Great improvement. Now with this patch more than 4x faster than previous one! Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
Hi hackers, I heard the opinion that this patch is too big and hard to review. So, I wander that we should downsize the patch by eliminating some features and leaving other basic features. If there are more opinions this makes it easer for reviewers to look at this patch, I would like do it. If so, we plan to support only selection, projection, inner-join, and some aggregates in the first release and leave sub-queries, outer-join, and CTE supports to the next release. Regards, Yugo Nagata On Tue, 22 Dec 2020 21:51:36 +0900 Yugo NAGATA wrote: > Hi, > > Attached is the revised patch (v20) to add support for Incremental > Materialized View Maintenance (IVM). > > In according with Konstantin's suggestion, I made a few optimizations. > > 1. Creating an index on the matview automatically > > When creating incremental maintainable materialized view (IMMV)s, > a unique index on IMMV is created automatically if possible. > > If the view definition query has a GROUP BY clause, the index is created > on the columns of GROUP BY expressions. Otherwise, if the view contains > all primary key attributes of its base tables in the target list, the index > is created on these attributes. Also, if the view has DISTINCT, > a unique index is created on all columns in the target list. > In other cases, no index is created. > > In all cases, a NOTICE message is output to inform users that an index is > created or that an appropriate index is necessary for efficient IVM. > > 2. Use a weaker lock on the matview if possible > > If the view has only one base table in this query, RowExclusiveLock is > held on the view instead of AccessExclusiveLock, because we don't > need to wait other concurrent transaction's result in order to > maintain the view in this case. When the same row in the view is > affected due to concurrent maintenances, a row level lock will > protect it. > > On Tue, 24 Nov 2020 12:46:57 +0300 > Konstantin Knizhnik wrote: > > > The most obvious optimization is not to use exclusive table lock if view > > depends just on one table (contains no joins). > > Looks like there are no any anomalies in this case, are there? > > I confirmed the effect of this optimizations. > > First, when I performed pgbench (SF=100) without any materialized views, > the results is : > > pgbench test4 -T 300 -c 8 -j 4 > latency average = 6.493 ms > tps = 1232.146229 (including connections establishing) > > Next, created a view as below, I performed the same pgbench. > CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm2 AS > SELECT bid, count(abalance), sum(abalance), avg(abalance) > FROM pgbench_accounts GROUP BY bid; > > The result is here: > > [the previous version (v19 with exclusive table lock)] > - latency average = 77.677 ms > - tps = 102.990159 (including connections establishing) > > [In the latest version (v20 with weaker lock)] > - latency average = 17.576 ms > - tps = 455.159644 (including connections establishing) > > There is still substantial overhead, but we can see that the effect > of the optimization. > > Regards, > Yugo Nagata > > -- > Yugo NAGATA -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi, Attached is the revised patch (v20) to add support for Incremental Materialized View Maintenance (IVM). In according with Konstantin's suggestion, I made a few optimizations. 1. Creating an index on the matview automatically When creating incremental maintainable materialized view (IMMV)s, a unique index on IMMV is created automatically if possible. If the view definition query has a GROUP BY clause, the index is created on the columns of GROUP BY expressions. Otherwise, if the view contains all primary key attributes of its base tables in the target list, the index is created on these attributes. Also, if the view has DISTINCT, a unique index is created on all columns in the target list. In other cases, no index is created. In all cases, a NOTICE message is output to inform users that an index is created or that an appropriate index is necessary for efficient IVM. 2. Use a weaker lock on the matview if possible If the view has only one base table in this query, RowExclusiveLock is held on the view instead of AccessExclusiveLock, because we don't need to wait other concurrent transaction's result in order to maintain the view in this case. When the same row in the view is affected due to concurrent maintenances, a row level lock will protect it. On Tue, 24 Nov 2020 12:46:57 +0300 Konstantin Knizhnik wrote: > The most obvious optimization is not to use exclusive table lock if view > depends just on one table (contains no joins). > Looks like there are no any anomalies in this case, are there? I confirmed the effect of this optimizations. First, when I performed pgbench (SF=100) without any materialized views, the results is : pgbench test4 -T 300 -c 8 -j 4 latency average = 6.493 ms tps = 1232.146229 (including connections establishing) Next, created a view as below, I performed the same pgbench. CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm2 AS SELECT bid, count(abalance), sum(abalance), avg(abalance) FROM pgbench_accounts GROUP BY bid; The result is here: [the previous version (v19 with exclusive table lock)] - latency average = 77.677 ms - tps = 102.990159 (including connections establishing) [In the latest version (v20 with weaker lock)] - latency average = 17.576 ms - tps = 455.159644 (including connections establishing) There is still substantial overhead, but we can see that the effect of the optimization. Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v20.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Wed, 25 Nov 2020 18:00:16 +0300 Konstantin Knizhnik wrote: > > > On 25.11.2020 16:06, Yugo NAGATA wrote: > > On Wed, 25 Nov 2020 15:16:05 +0300 > > Konstantin Knizhnik wrote: > > > >> > >> On 24.11.2020 13:11, Yugo NAGATA wrote: > I wonder if it is possible to somehow use predicate locking mechanism of > Postgres to avoid this anomalies without global lock? > >>> You mean that, ,instead of using any table lock, if any possibility of the > >>> anomaly is detected using predlock mechanism then abort the transaction? > >> Yes. If both transactions are using serializable isolation level, then > >> lock is not needed, isn't it? > >> So at least you can add yet another simple optimization: if transaction > >> has serializable isolation level, > >> then exclusive lock is not required. > > As long as we use the trigger approach, we can't handle concurrent view > > maintenance > > in either repeatable read or serializable isolation level. It is because > > one > > transaction (R= R+dR) cannot see changes occurred in another transaction > > (S'= S+dS) > > in such cases, and we cannot get the incremental change on the view > > (dV=dR*dS). > > Therefore, in the current implementation, the transaction is aborted when > > the > > concurrent view maintenance happens in repeatable read or serializable. > > Sorry, may be I do not correctly understand you or you do not understand me. > Lets consider two serializable transactions (I do not use view or > triggers, but perform correspondent updates manually): > > > > create table t(pk integer, val int); > create table mat_view(gby_key integer primary key, total bigint); > insert into t values (1,0),(2,0); > insert into mat_view values (1,0),(2,0); > > Session 1: Session 2: > > begin isolation level serializable; > begin isolation level serializable; > insert into t values (1,200); insert into t > values (1,300); > update mat_view set total=total+200 where gby_key=1; > update mat_view set total=total+300 where gby_key=1; > > commit; > ERROR: could not serialize access due to concurrent update > > So both transactions are aborted. > It is expected behavior for serializable transactions. > But if transactions updating different records of mat_view, then them > can be executed concurrently: > > Session 1: Session 2: > > begin isolation level serializable; > begin isolation level serializable; > insert into t values (1,200); insert into t > values (2,300); > update mat_view set total=total+200 where gby_key=1; > update mat_view set total=total+300 where gby_key=2; > commit; commit; > > So, if transactions are using serializable isolation level, then we can > update mat view without exclusive lock > and if there is not conflict, this transaction can be executed concurrently. > > Please notice, that exclusive lock doesn't prevent conflict in first case: > > Session 1: Session 2: > > begin isolation level serializable; > begin isolation level serializable; > insert into t values (1,200); insert into t > values (1,300); > lock table mat_view; > update mat_view set total=total+200 where gby_key=1; > lock table mat_view; > > commit; > update mat_view set total=total+300 where gby_key=1; > commit; > ERROR: could not serialize access due to concurrent update > > > So do you agree that there are no reasons for using explicit lock for > serializable transactions? Yes, I agree. I said an anomaly could occur in repeatable read and serializable isolation level, but it was wrong. In serializable, the transaction will be aborted in programmable cases due to predicate locks, and we don't need the lock. However, in repeatable read, the anomaly still could occurs when the table is defined on more than one base tables even if we lock the view. To prevent it, the only way I found is aborting the transaction forcedly in such cases for now. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On 25.11.2020 16:06, Yugo NAGATA wrote: On Wed, 25 Nov 2020 15:16:05 +0300 Konstantin Knizhnik wrote: On 24.11.2020 13:11, Yugo NAGATA wrote: I wonder if it is possible to somehow use predicate locking mechanism of Postgres to avoid this anomalies without global lock? You mean that, ,instead of using any table lock, if any possibility of the anomaly is detected using predlock mechanism then abort the transaction? Yes. If both transactions are using serializable isolation level, then lock is not needed, isn't it? So at least you can add yet another simple optimization: if transaction has serializable isolation level, then exclusive lock is not required. As long as we use the trigger approach, we can't handle concurrent view maintenance in either repeatable read or serializable isolation level. It is because one transaction (R= R+dR) cannot see changes occurred in another transaction (S'= S+dS) in such cases, and we cannot get the incremental change on the view (dV=dR*dS). Therefore, in the current implementation, the transaction is aborted when the concurrent view maintenance happens in repeatable read or serializable. Sorry, may be I do not correctly understand you or you do not understand me. Lets consider two serializable transactions (I do not use view or triggers, but perform correspondent updates manually): create table t(pk integer, val int); create table mat_view(gby_key integer primary key, total bigint); insert into t values (1,0),(2,0); insert into mat_view values (1,0),(2,0); Session 1: Session 2: begin isolation level serializable; begin isolation level serializable; insert into t values (1,200); insert into t values (1,300); update mat_view set total=total+200 where gby_key=1; update mat_view set total=total+300 where gby_key=1; commit; ERROR: could not serialize access due to concurrent update So both transactions are aborted. It is expected behavior for serializable transactions. But if transactions updating different records of mat_view, then them can be executed concurrently: Session 1: Session 2: begin isolation level serializable; begin isolation level serializable; insert into t values (1,200); insert into t values (2,300); update mat_view set total=total+200 where gby_key=1; update mat_view set total=total+300 where gby_key=2; commit; commit; So, if transactions are using serializable isolation level, then we can update mat view without exclusive lock and if there is not conflict, this transaction can be executed concurrently. Please notice, that exclusive lock doesn't prevent conflict in first case: Session 1: Session 2: begin isolation level serializable; begin isolation level serializable; insert into t values (1,200); insert into t values (1,300); lock table mat_view; update mat_view set total=total+200 where gby_key=1; lock table mat_view; commit; update mat_view set total=total+300 where gby_key=1; commit; ERROR: could not serialize access due to concurrent update So do you agree that there are no reasons for using explicit lock for serializable transactions? -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Implementing Incremental View Maintenance
On Wed, 25 Nov 2020 15:16:05 +0300 Konstantin Knizhnik wrote: > > > On 24.11.2020 13:11, Yugo NAGATA wrote: > > > >> I wonder if it is possible to somehow use predicate locking mechanism of > >> Postgres to avoid this anomalies without global lock? > > You mean that, ,instead of using any table lock, if any possibility of the > > anomaly is detected using predlock mechanism then abort the transaction? > > Yes. If both transactions are using serializable isolation level, then > lock is not needed, isn't it? > So at least you can add yet another simple optimization: if transaction > has serializable isolation level, > then exclusive lock is not required. As long as we use the trigger approach, we can't handle concurrent view maintenance in either repeatable read or serializable isolation level. It is because one transaction (R= R+dR) cannot see changes occurred in another transaction (S'= S+dS) in such cases, and we cannot get the incremental change on the view (dV=dR*dS). Therefore, in the current implementation, the transaction is aborted when the concurrent view maintenance happens in repeatable read or serializable. > But I wonder if we can go further so that even if transaction is using > read-committed or repeatable-read isolation level, > we still can replace exclusive table lock with predicate locks. > > The main problem with this approach (from my point of view) is the > predicate locks are able to detect conflict but not able to prevent it. > I.e. if such conflict is detected then transaction has to be aborted. > And it is not always desirable, especially because user doesn't expect > it: how can insertion of single record with unique keys in a table cause > transaction conflict? > And this is what will happen in your example with transactions T1 and T2 > inserting records in R and S tables. Yes. I wonder that either aborting transaction or waiting on locks is unavoidable when a view is incrementally updated concurrently (at least in the immediate maintenance where a view is update in the same transaction that updates the base table). > And what do you think about backrgound update of materialized view? > On update/insert trigger will just add record to some "delta" table and > then some background worker will update view. > Certainly in this case we loose synchronization between main table and > materialized view (last one may contain slightly deteriorated data). > But in this case no exclusive lock is needed, isn't it? Of course, we are considering this type of view maintenance. This is deferred maintenance where a view is update after the transaction that updates the base tables is committed. Views can be updated in bacground in a appropreate timing or as a response of a user command. To implement this, we needs a mechanism to maintain change logs which records changes of base tables. We think that implementing this infrastructure is not trivial work, so, in the first patch proposal, we decided to start from immediate approach which needs less code. -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On 24.11.2020 13:11, Yugo NAGATA wrote: I wonder if it is possible to somehow use predicate locking mechanism of Postgres to avoid this anomalies without global lock? You mean that, ,instead of using any table lock, if any possibility of the anomaly is detected using predlock mechanism then abort the transaction? Yes. If both transactions are using serializable isolation level, then lock is not needed, isn't it? So at least you can add yet another simple optimization: if transaction has serializable isolation level, then exclusive lock is not required. But I wonder if we can go further so that even if transaction is using read-committed or repeatable-read isolation level, we still can replace exclusive table lock with predicate locks. The main problem with this approach (from my point of view) is the predicate locks are able to detect conflict but not able to prevent it. I.e. if such conflict is detected then transaction has to be aborted. And it is not always desirable, especially because user doesn't expect it: how can insertion of single record with unique keys in a table cause transaction conflict? And this is what will happen in your example with transactions T1 and T2 inserting records in R and S tables. And what do you think about backrgound update of materialized view? On update/insert trigger will just add record to some "delta" table and then some background worker will update view. Certainly in this case we loose synchronization between main table and materialized view (last one may contain slightly deteriorated data). But in this case no exclusive lock is needed, isn't it? -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Implementing Incremental View Maintenance
On Tue, 24 Nov 2020 12:46:57 +0300 Konstantin Knizhnik wrote: > > > On 24.11.2020 12:21, Yugo NAGATA wrote: > > > >> I replaced it with RowExlusiveLock and ... got 1437 TPS with 10 > >> connections. > >> It is still about 7 times slower than performance without incremental view. > >> But now the gap is not so dramatic. And it seems to be clear that this > >> exclusive lock on matview is real show stopper for concurrent updates. > >> I do not know which race conditions and anomalies we can get if replace > >> table-level lock with row-level lock here. > > I explained it here: > > https://www.postgresql.org/message-id/20200909092752.c91758a1bec3479668e82643%40sraoss.co.jp > > > > For example, suppose there is a view V = R*S that joins tables R and S, > > and there are two concurrent transactions T1 which changes table R to R' > > and T2 which changes S to S'. Without any lock, in READ COMMITTED mode, > > V would be updated to R'*S in T1, and R*S' in T2, so it would cause > > inconsistency. By locking the view V, transactions T1, T2 are processed > > serially and this inconsistency can be avoided. > > > > Especially, suppose that tuple dR is inserted into R in T1, and dS is > > inserted into S in T2, where dR and dS will be joined in according to > > the view definition. In this situation, without any lock, the change of V is > > computed as dV=dR*S in T1, dV=R*dS in T2, respectively, and dR*dS would not > > be included in the results. This inconsistency could not be resolved by > > row-level lock. > > > >> But I think that this problem should be addressed in any case: single > >> client update mode is very rare scenario. > > This behavior is explained in rules.sgml like this: > > > > + > > +Concurrent Transactions > > + > > +Suppose an IMMV is defined on two base tables and > > each > > +table was modified in different a concurrent transaction > > simultaneously. > > +In the transaction which was committed first, IMMV > > can > > +be updated considering only the change which happened in this > > transaction. > > +On the other hand, in order to update the view correctly in the > > transaction > > +which was committed later, we need to know the changes occurred in > > +both transactions. For this reason, ExclusiveLock > > +is held on an IMMV immediately after a base table is > > +modified in READ COMMITTED mode to make sure that > > +the IMMV is updated in the latter transaction after > > +the former transaction is committed. In REPEATABLE > > READ > > +or SERIALIZABLE mode, an error is raised immediately > > +if lock acquisition fails because any changes which occurred in > > +other transactions are not be visible in these modes and > > +IMMV cannot be updated correctly in such situations. > > + > > + > > > > Hoever, should we describe explicitly its impact on performance here? > > > > Sorry, I didn't think much about this problem. > But I think that it is very important to try to find some solution of > the problem. > The most obvious optimization is not to use exclusive table lock if view > depends just on one table (contains no joins). > Looks like there are no any anomalies in this case, are there? Thank you for your suggestion! That makes sense. > Yes, most analytic queries contain joins (just two queries among 22 > TPC-H have no joins). > So may be this optimization will not help much. Yes, but if a user want to incrementally maintain only aggregate views on a large table, like TPC-H Q1, it will be helpful. For this optimization, we have to only check the number of RTE in the rtable list and it would be cheap. > I wonder if it is possible to somehow use predicate locking mechanism of > Postgres to avoid this anomalies without global lock? You mean that, ,instead of using any table lock, if any possibility of the anomaly is detected using predlock mechanism then abort the transaction? I don't have concrete idea to implement it and know if it is possible yet, but I think it is worth to consider this. Thanks. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On 24.11.2020 12:21, Yugo NAGATA wrote: I replaced it with RowExlusiveLock and ... got 1437 TPS with 10 connections. It is still about 7 times slower than performance without incremental view. But now the gap is not so dramatic. And it seems to be clear that this exclusive lock on matview is real show stopper for concurrent updates. I do not know which race conditions and anomalies we can get if replace table-level lock with row-level lock here. I explained it here: https://www.postgresql.org/message-id/20200909092752.c91758a1bec3479668e82643%40sraoss.co.jp For example, suppose there is a view V = R*S that joins tables R and S, and there are two concurrent transactions T1 which changes table R to R' and T2 which changes S to S'. Without any lock, in READ COMMITTED mode, V would be updated to R'*S in T1, and R*S' in T2, so it would cause inconsistency. By locking the view V, transactions T1, T2 are processed serially and this inconsistency can be avoided. Especially, suppose that tuple dR is inserted into R in T1, and dS is inserted into S in T2, where dR and dS will be joined in according to the view definition. In this situation, without any lock, the change of V is computed as dV=dR*S in T1, dV=R*dS in T2, respectively, and dR*dS would not be included in the results. This inconsistency could not be resolved by row-level lock. But I think that this problem should be addressed in any case: single client update mode is very rare scenario. This behavior is explained in rules.sgml like this: + +Concurrent Transactions + +Suppose an IMMV is defined on two base tables and each +table was modified in different a concurrent transaction simultaneously. +In the transaction which was committed first, IMMV can +be updated considering only the change which happened in this transaction. +On the other hand, in order to update the view correctly in the transaction +which was committed later, we need to know the changes occurred in +both transactions. For this reason, ExclusiveLock +is held on an IMMV immediately after a base table is +modified in READ COMMITTED mode to make sure that +the IMMV is updated in the latter transaction after +the former transaction is committed. In REPEATABLE READ +or SERIALIZABLE mode, an error is raised immediately +if lock acquisition fails because any changes which occurred in +other transactions are not be visible in these modes and +IMMV cannot be updated correctly in such situations. + + Hoever, should we describe explicitly its impact on performance here? Sorry, I didn't think much about this problem. But I think that it is very important to try to find some solution of the problem. The most obvious optimization is not to use exclusive table lock if view depends just on one table (contains no joins). Looks like there are no any anomalies in this case, are there? Yes, most analytic queries contain joins (just two queries among 22 TPC-H have no joins). So may be this optimization will not help much. I wonder if it is possible to somehow use predicate locking mechanism of Postgres to avoid this anomalies without global lock? -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Implementing Incremental View Maintenance
On Thu, 12 Nov 2020 15:37:42 +0300 Konstantin Knizhnik wrote: > Well, creation of proper indexes for table is certainly responsibility > of DBA. > But users may not consider materialized view as normal table. So the > idea that index should > be explicitly created for materialized view seems to be not so obvious. > From the other side, implementation of materialized view knows which > index is needed for performing efficient incremental update. > I wonder if it can create such index itself implicitly or at least > produce notice with proposal to create such index. That makes sense. Especially for aggregate views, it is obvious that creating an index on expressions used in GROUP BY is effective. For other views, creating an index on columns that come from primary keys of base tables would be effective if any. However, if any base table doesn't have a primary or unique key or such key column is not contained in the view's target list, it is hard to decide an appropriate index on the view. We can create an index on all columns in the target list, but it could cause overhead on view maintenance. So, just producing notice would be better for such cases. > I looked throw your patch for exclusive table locks and found this > fragment in matview.c: > > /* > * Wait for concurrent transactions which update this materialized > view at > * READ COMMITED. This is needed to see changes committed in other > * transactions. No wait and raise an error at REPEATABLE READ or > * SERIALIZABLE to prevent update anomalies of matviews. > * XXX: dead-lock is possible here. > */ > if (!IsolationUsesXactSnapshot()) > LockRelationOid(matviewOid, ExclusiveLock); > else if (!ConditionalLockRelationOid(matviewOid, ExclusiveLock)) > > > I replaced it with RowExlusiveLock and ... got 1437 TPS with 10 connections. > It is still about 7 times slower than performance without incremental view. > But now the gap is not so dramatic. And it seems to be clear that this > exclusive lock on matview is real show stopper for concurrent updates. > I do not know which race conditions and anomalies we can get if replace > table-level lock with row-level lock here. I explained it here: https://www.postgresql.org/message-id/20200909092752.c91758a1bec3479668e82643%40sraoss.co.jp For example, suppose there is a view V = R*S that joins tables R and S, and there are two concurrent transactions T1 which changes table R to R' and T2 which changes S to S'. Without any lock, in READ COMMITTED mode, V would be updated to R'*S in T1, and R*S' in T2, so it would cause inconsistency. By locking the view V, transactions T1, T2 are processed serially and this inconsistency can be avoided. Especially, suppose that tuple dR is inserted into R in T1, and dS is inserted into S in T2, where dR and dS will be joined in according to the view definition. In this situation, without any lock, the change of V is computed as dV=dR*S in T1, dV=R*dS in T2, respectively, and dR*dS would not be included in the results. This inconsistency could not be resolved by row-level lock. > But I think that this problem should be addressed in any case: single > client update mode is very rare scenario. This behavior is explained in rules.sgml like this: + +Concurrent Transactions + +Suppose an IMMV is defined on two base tables and each +table was modified in different a concurrent transaction simultaneously. +In the transaction which was committed first, IMMV can +be updated considering only the change which happened in this transaction. +On the other hand, in order to update the view correctly in the transaction +which was committed later, we need to know the changes occurred in +both transactions. For this reason, ExclusiveLock +is held on an IMMV immediately after a base table is +modified in READ COMMITTED mode to make sure that +the IMMV is updated in the latter transaction after +the former transaction is committed. In REPEATABLE READ +or SERIALIZABLE mode, an error is raised immediately +if lock acquisition fails because any changes which occurred in +other transactions are not be visible in these modes and +IMMV cannot be updated correctly in such situations. + + Hoever, should we describe explicitly its impact on performance here? > I attached to this mail profile of pgbench workload with defined > incremental view (with index). > May be you will find it useful. Thank you for your profiling! Hmm, it shows that overhead of executing query for calculating the delta (refresh_mateview_datfill) and applying the delta (SPI_exec) is large I will investigate if more optimizations to reduce the overhead is possible. > > One more disappointing observation of materialized views (now > non-incremental). > Time of creation of non-incremental materialized view is about 18 seconds: > > postgres=# create materialized view teller_avgs
Re: Implementing Incremental View Maintenance
On Wed, 11 Nov 2020 19:10:35 +0300 Konstantin Knizhnik wrote: Thank you for reviewing this patch! > > The patch is not applied to the current master because makeFuncCall > prototype is changed, > I fixed it by adding COAERCE_CALL_EXPLICIT. The rebased patch was submitted. > Ooops! Now TPS are much lower: > > tps = 141.767347 (including connections establishing) > > Speed of updates is reduced more than 70 times! > Looks like we loose parallelism because almost the same result I get > with just one connection. As you and Ishii-san mentioned in other posts, I think the reason would be a table lock on the materialized view that is acquired during view maintenance. I will explain more a bit in another post. > 4. Finally let's create one more view (it is reasonable to expect that > analytics will run many different queries and so need multiple views). > > create incremental materialized view teller_avgs as select > t.tid,avg(abalance) from pgbench_accounts a join pgbench_tellers t on > a.bid=t.bid group by t.tid; > > It is great that not only simple aggregates like SUM are supported, but > also AVG. > But insertion speed now is reduced twice - 72TPS. Yes, the current implementation takes twice time for updating a table time when a new incrementally maintainable materialized view is defined on the table because view maintenance is performed for each view. > > So good news is that incremental materialized views really work. > And bad news is that maintenance overhead is too large which > significantly restrict applicability of this approach. > Certainly in case of dominated read-only workload such materialized > views can significantly improve performance. > But unfortunately my dream that them allow to combine OLAP+OLPT is not > currently realized. As you concluded, there is a large overhead on updating base tables in the current implementation because it is immediate maintenance in which the view is updated in the same sentence where its base table is modified. Therefore, this is not suitable to OLTP workload where there are frequent updates of tables. For suppressing maintenance overhead in such workload, we have to implement "deferred maintenance" which collects table change logs and updates the view in another transaction afterward. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
> 1. Create pgbench database with scale 100. > pgbench speed at my desktop is about 10k TPS: > > pgbench -M prepared -N -c 10 -j 4 -T 30 -P 1 postgres > tps = 10194.951827 (including connections establishing) > > 2. Then I created incremental materialized view: > > create incremental materialized view teller_sums as select > t.tid,sum(abalance) from pgbench_accounts a join pgbench_tellers t on > a.bid=t.bid group by t.tid; > SELECT 1000 > Time: 20805.230 ms (00:20.805) > > 20 second is reasonable time, comparable with time of database > initialization. > > Then obviously we see advantages of precalculated aggregates: > > postgres=# select * from teller_sums where tid=1; > tid | sum > -+ > 1 | -96427 > (1 row) > > Time: 0.871 ms > postgres=# select t.tid,sum(abalance) from pgbench_accounts a join > pgbench_tellers t on a.bid=t.bid group by t.tid having t.tid=1 > ; > tid | sum > -+ > 1 | -96427 > (1 row) > > Time: 915.508 ms > > Amazing. Almost 1000 times difference! > > 3. Run pgbench once again: > > Ooops! Now TPS are much lower: > > tps = 141.767347 (including connections establishing) > > Speed of updates is reduced more than 70 times! > Looks like we loose parallelism because almost the same result I get > with just one connection. How much TPS do you get if you execute pgbench -c 1 without incremental materialized view defined? If it's around 141 then we could surely confirm that the major bottle neck is locking contention. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
On Thu, 5 Nov 2020 22:58:25 -0600 Justin Pryzby wrote: > On Mon, Oct 05, 2020 at 06:16:18PM +0900, Yugo NAGATA wrote: > This needs to be rebased again - the last version doesn't apply anymore. > http://cfbot.cputube.org/yugo-nagata.html I attached the rebased patch (v19). > I looked though it a bit and attach some fixes to the user-facing docs. Thank you for pointing out a lot of typos and making the patch to fix it! Your fixes are included in the latest patch. > There's some more typos in the source that I didn't fix: > constrains > materliazied > cluase > immediaite > clumn > Temrs > migth > recalculaetd > speified > secuirty > > commit message: comletion > > psql and pg_dump say 13 but should say 14 now: > pset.sversion >= 13 These were also fixed. > # bag union > big union? "bag union" is union operation of bag (multi-set) that does not eliminate duplicate of tuples. > + relisivm bool > + > + > + True if materialized view enables incremental view maintenance > > This isn't clear, but I think it should say "True for materialized views which > are enabled for incremental view maintenance (IVM)." Yes, you are right. I also fixed it in this way. Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v19.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
Hello Konstantin, I remember testing it with pg_stat_statements (and planning counters enabled). Maybe identifying internal queries associated with this (simple) test case, could help dev team ? Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Implementing Incremental View Maintenance
On 05.10.2020 12:16, Yugo NAGATA wrote: Hi, Attached is the rebased patch (v18) to add support for Incremental Materialized View Maintenance (IVM). It is able to be applied to current latest master branch. Thank you very much for this work. I consider incremental materialized views as "reincarnation" of OLAP hypercubes. There are two approaches of making OLAP queries faster: 1. speed up query execution (using JIT, columnar store, vector operations and parallel execution) 2. precalculate requested data Incremental materialize views make it possible to implement second approach. But how competitive it is? I do not know current limitations of incremental materialized views, but I checked that basic OLAP functionality: JOIN+GROUP_BY+AGGREGATION is supported. The patch is not applied to the current master because makeFuncCall prototype is changed, I fixed it by adding COAERCE_CALL_EXPLICIT. Then I did the following simple test: 1. Create pgbench database with scale 100. pgbench speed at my desktop is about 10k TPS: pgbench -M prepared -N -c 10 -j 4 -T 30 -P 1 postgres tps = 10194.951827 (including connections establishing) 2. Then I created incremental materialized view: create incremental materialized view teller_sums as select t.tid,sum(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid; SELECT 1000 Time: 20805.230 ms (00:20.805) 20 second is reasonable time, comparable with time of database initialization. Then obviously we see advantages of precalculated aggregates: postgres=# select * from teller_sums where tid=1; tid | sum -+ 1 | -96427 (1 row) Time: 0.871 ms postgres=# select t.tid,sum(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid having t.tid=1 ; tid | sum -+ 1 | -96427 (1 row) Time: 915.508 ms Amazing. Almost 1000 times difference! 3. Run pgbench once again: Ooops! Now TPS are much lower: tps = 141.767347 (including connections establishing) Speed of updates is reduced more than 70 times! Looks like we loose parallelism because almost the same result I get with just one connection. 4. Finally let's create one more view (it is reasonable to expect that analytics will run many different queries and so need multiple views). create incremental materialized view teller_avgs as select t.tid,avg(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid; It is great that not only simple aggregates like SUM are supported, but also AVG. But insertion speed now is reduced twice - 72TPS. I tried to make some profiling but didn't see something unusual: 16.41% postgres postgres [.] ExecInterpExpr 8.78% postgres postgres [.] slot_deform_heap_tuple 3.23% postgres postgres [.] ExecMaterial 2.71% postgres postgres [.] AllocSetCheck 2.33% postgres postgres [.] AllocSetAlloc 2.29% postgres postgres [.] slot_getsomeattrs_int 2.26% postgres postgres [.] ExecNestLoop 2.11% postgres postgres [.] MemoryContextReset 1.98% postgres postgres [.] tts_minimal_store_tuple 1.87% postgres postgres [.] heap_compute_data_size 1.78% postgres postgres [.] fill_val 1.56% postgres postgres [.] tuplestore_gettuple 1.44% postgres postgres [.] sentinel_ok 1.35% postgres postgres [.] heap_fill_tuple 1.27% postgres postgres [.] tuplestore_gettupleslot 1.17% postgres postgres [.] ExecQual 1.14% postgres postgres [.] tts_minimal_clear 1.13% postgres postgres [.] CheckOpSlotCompatibility 1.10% postgres postgres [.] base_yyparse 1.10% postgres postgres [.] heapgetpage 1.04% postgres postgres [.] heap_form_minimal_tuple 1.00% postgres postgres [.] slot_getsomeattrs So good news is that incremental materialized views really work. And bad news is that maintenance overhead is too large which significantly restrict applicability of this approach. Certainly in case of dominated read-only workload such materialized views can significantly improve performance. But unfortunately my dream that them allow to combine OLAP+OLPT is not currently realized. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Implementing Incremental View Maintenance
On Mon, Oct 05, 2020 at 06:16:18PM +0900, Yugo NAGATA wrote: > Hi, > > Attached is the rebased patch (v18) to add support for Incremental This needs to be rebased again - the last version doesn't apply anymore. http://cfbot.cputube.org/yugo-nagata.html I looked though it a bit and attach some fixes to the user-facing docs. There's some more typos in the source that I didn't fix: constrains materliazied cluase immediaite clumn Temrs migth recalculaetd speified secuirty commit message: comletion psql and pg_dump say 13 but should say 14 now: pset.sversion >= 13 # bag union big union? + relisivm bool + + + True if materialized view enables incremental view maintenance This isn't clear, but I think it should say "True for materialized views which are enabled for incremental view maintenance (IVM)." -- Justin >From 568f8626e2d9ab0deb25ac9e10089a79abecdab0 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 5 Nov 2020 22:54:23 -0600 Subject: [PATCH] incremental view doc fixes --- doc/src/sgml/catalogs.sgml| 2 +- .../sgml/ref/create_materialized_view.sgml| 44 +++--- .../sgml/ref/refresh_materialized_view.sgml | 4 +- doc/src/sgml/rules.sgml | 132 +- 4 files changed, 90 insertions(+), 92 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 9b52119382..b942605a1b 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -3479,7 +3479,7 @@ SCRAM-SHA-256$iteration count: The dependent object was created as part of creation of the Materialized View with Incremental View Maintenance reference, and is really just a part of its internal implementation. The dependent object must not be - dropped unless materialized view dropped. + dropped unless the materialized view is also dropped. diff --git a/doc/src/sgml/ref/create_materialized_view.sgml b/doc/src/sgml/ref/create_materialized_view.sgml index 6f4e634ab0..e034a2c17f 100644 --- a/doc/src/sgml/ref/create_materialized_view.sgml +++ b/doc/src/sgml/ref/create_materialized_view.sgml @@ -67,7 +67,7 @@ CREATE [ INCREMENTAL ] MATERIALIZED VIEW [ IF NOT EXISTS ] table_na There are restrictions of query definitions allowed to use this - option. Followings are supported query definitions for IMMV: + option. The following are supported in query definitions for IMMV: @@ -78,12 +78,12 @@ CREATE [ INCREMENTAL ] MATERIALIZED VIEW [ IF NOT EXISTS ] table_na - Outer joins with following restrictions: + Outer joins with the following restrictions: -Outer join view's targetlist must contain attributes used in the +Outer join view's targetlist must contain all attributes used in the join conditions. CREATE INCREMENTAL MATERIALIZED VIEW mv AS SELECT a.i FROM mv_base_a a LEFT @@ -95,7 +95,7 @@ ERROR: targetlist must contain vars in the join condition for IVM with outer jo -Outer join view's targetlist cannot contain non strict functions. +Outer join view's targetlist cannot contain non-strict functions. CREATE INCREMENTAL MATERIALIZED VIEW mv AS SELECT a.i, b.i, (k > 10 OR k = -1) FROM mv_base_a a LEFT JOIN mv_base_b b ON a.i=b.i; @@ -135,7 +135,7 @@ ERROR: WHERE cannot contain non null-rejecting predicates for IVM with outer jo -Aggregate is not supported with outer join. +Aggregates are not supported with outer join. CREATE INCREMENTAL MATERIALIZED VIEW mv(a,b,v) AS SELECT a.i, b.i, sum(k) FROM mv_base_a a LEFT JOIN mv_base_b b ON a.i=b.i GROUP BY a.i, b.i; @@ -165,7 +165,7 @@ ERROR: subquery is not supported by IVM together with outer join - Subqueries. However following forms are not supported. + Subqueries. However, the following forms are not supported. @@ -176,14 +176,14 @@ mv_base_a WHERE i IN (SELECT i FROM mv_base_b WHERE k 103 ); - subqueries in target list is not supported: + subqueries in the target list are not supported: CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm05 AS SELECT i,j, (SELECT k FROM mv_base_b b WHERE a.i = b.i) FROM mv_base_a a; - Nested EXISTS subqueries is not supported: + Nested EXISTS subqueries are not supported: CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm11 AS SELECT a.i,a.j FROM mv_base_a a WHERE EXISTS(SELECT 1 FROM mv_base_b b WHERE EXISTS(SELECT @@ -210,14 +210,14 @@ a.i 5; - Simple CTEs which do not contain aggregates or DISTINCT. + Simple CTEs that do not contain aggregates or DISTINCT.
Re: Implementing Incremental View Maintenance
On Wed, 28 Oct 2020 12:01:58 +0300 Anastasia Lubennikova wrote: > ср, 28 окт. 2020 г. в 08:02, Yugo NAGATA : > > > Hi Anastasia Lubennikova, > > > > I am writing this to you because I would like to ask the commitfest > > manager something. > > > > The status of the patch was changed to "Waiting on Author" from > > "Ready for Committer" at the beginning of this montfor the reason > > that rebase was necessary. Now I updated the patch, so can I change > > the status back to "Ready for Committer"? > > > > Regards, > > Yugo Nagata > > > > > Yes, go ahead. As far as I see, the patch is in a good shape and there are > no unanswered questions from reviewers. > Feel free to change the status of CF entries, when it seems reasonable to > you. Thank you for your response! I get it. > P.S. Please, avoid top-posting, It makes it harder to follow the > discussion, in-line replies are customary in pgsql mailing lists. > See https://en.wikipedia.org/wiki/Posting_style#Top-posting for details. I understand it. Regards, Yugo Nagata > -- > Best regards, > Lubennikova Anastasia -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
ср, 28 окт. 2020 г. в 08:02, Yugo NAGATA : > Hi Anastasia Lubennikova, > > I am writing this to you because I would like to ask the commitfest > manager something. > > The status of the patch was changed to "Waiting on Author" from > "Ready for Committer" at the beginning of this montfor the reason > that rebase was necessary. Now I updated the patch, so can I change > the status back to "Ready for Committer"? > > Regards, > Yugo Nagata > > Yes, go ahead. As far as I see, the patch is in a good shape and there are no unanswered questions from reviewers. Feel free to change the status of CF entries, when it seems reasonable to you. P.S. Please, avoid top-posting, It makes it harder to follow the discussion, in-line replies are customary in pgsql mailing lists. See https://en.wikipedia.org/wiki/Posting_style#Top-posting for details. -- Best regards, Lubennikova Anastasia
Re: Implementing Incremental View Maintenance
On Tue, 27 Oct 2020 12:14:52 -0400 Adam Brusselback wrote: > That was a good bit more work to get ready than I expected. It's broken > into two scripts, one to create the schema, the other to load data and > containing a couple check queries to ensure things are working properly > (checking the materialized tables against a regular view for accuracy). Thank you very much! I am really grateful. > The first test case is to give us a definitive result on what "agreed > pricing" is in effect at a point in time based on a product hierarchy > our customers setup, and allow pricing to be set on nodes in that > hierarchy, as well as specific products (with an order of precedence). > The second test case maintains some aggregated amounts / counts / boolean > logic at an "invoice" level for all the detail lines which make up that > invoice. > > Both of these are real-world use cases which were simplified a bit to make > them easier to understand. We have other use cases as well, but with how > much time this took to prepare i'll keep it at this for now. > If you need anything clarified or have any issues, just let me know. Although I have not look into it in details yet, in my understanding, it seems that materialized views are used to show "pricing" or "invoice" information before the order is confirmed, that is, before the transaction is committed. Definitely, these will be use cases where immediate view maintenance is useful. I am happy because I found concrete use cases of immediate IVM. However, unfortunately, the view definitions in your cases are complex, and the current implementation of the patch doesn't support it. We would like to improve the feature in future so that more complex views could benefit from IVM. Regards, Yugo Nagata > On Fri, Oct 23, 2020 at 3:58 AM Yugo NAGATA wrote: > > > Hi Adam, > > > > On Thu, 22 Oct 2020 10:07:29 -0400 > > Adam Brusselback wrote: > > > > > Hey there Yugo, > > > I've asked a coworker to prepare a self contained example that > > encapsulates > > > our multiple use cases. > > > > Thank you very much! > > > > > The immediate/eager approach is exactly what we need, as within the same > > > transaction we have statements that can cause one of those "materialized > > > tables" to be updated, and then sometimes have the need to query that > > > "materialized table" in a subsequent statement and need to see the > > changes > > > reflected. > > > > The proposed patch provides the exact this feature and I think this will > > meet > > your needs. > > > > > As soon as my coworker gets that example built up I'll send a followup > > with > > > it attached. > > > > Great! We are looking forward to it. > > > > Regards, > > Yugo Nagata > > > > -- > > Yugo NAGATA > > -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi Anastasia Lubennikova, I am writing this to you because I would like to ask the commitfest manager something. The status of the patch was changed to "Waiting on Author" from "Ready for Committer" at the beginning of this montfor the reason that rebase was necessary. Now I updated the patch, so can I change the status back to "Ready for Committer"? Regards, Yugo Nagata On Mon, 5 Oct 2020 18:16:18 +0900 Yugo NAGATA wrote: > Hi, > > Attached is the rebased patch (v18) to add support for Incremental > Materialized View Maintenance (IVM). It is able to be applied to > current latest master branch. -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
That was a good bit more work to get ready than I expected. It's broken into two scripts, one to create the schema, the other to load data and containing a couple check queries to ensure things are working properly (checking the materialized tables against a regular view for accuracy). The first test case is to give us a definitive result on what "agreed pricing" is in effect at a point in time based on a product hierarchy our customers setup, and allow pricing to be set on nodes in that hierarchy, as well as specific products (with an order of precedence). The second test case maintains some aggregated amounts / counts / boolean logic at an "invoice" level for all the detail lines which make up that invoice. Both of these are real-world use cases which were simplified a bit to make them easier to understand. We have other use cases as well, but with how much time this took to prepare i'll keep it at this for now. If you need anything clarified or have any issues, just let me know. On Fri, Oct 23, 2020 at 3:58 AM Yugo NAGATA wrote: > Hi Adam, > > On Thu, 22 Oct 2020 10:07:29 -0400 > Adam Brusselback wrote: > > > Hey there Yugo, > > I've asked a coworker to prepare a self contained example that > encapsulates > > our multiple use cases. > > Thank you very much! > > > The immediate/eager approach is exactly what we need, as within the same > > transaction we have statements that can cause one of those "materialized > > tables" to be updated, and then sometimes have the need to query that > > "materialized table" in a subsequent statement and need to see the > changes > > reflected. > > The proposed patch provides the exact this feature and I think this will > meet > your needs. > > > As soon as my coworker gets that example built up I'll send a followup > with > > it attached. > > Great! We are looking forward to it. > > Regards, > Yugo Nagata > > -- > Yugo NAGATA > 02_materialized_test_data.sql Description: Binary data 01_materialized_test_schema.sql Description: Binary data
Re: Implementing Incremental View Maintenance
Hi Adam, On Thu, 22 Oct 2020 10:07:29 -0400 Adam Brusselback wrote: > Hey there Yugo, > I've asked a coworker to prepare a self contained example that encapsulates > our multiple use cases. Thank you very much! > The immediate/eager approach is exactly what we need, as within the same > transaction we have statements that can cause one of those "materialized > tables" to be updated, and then sometimes have the need to query that > "materialized table" in a subsequent statement and need to see the changes > reflected. The proposed patch provides the exact this feature and I think this will meet your needs. > As soon as my coworker gets that example built up I'll send a followup with > it attached. Great! We are looking forward to it. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hey there Yugo, I've asked a coworker to prepare a self contained example that encapsulates our multiple use cases. The immediate/eager approach is exactly what we need, as within the same transaction we have statements that can cause one of those "materialized tables" to be updated, and then sometimes have the need to query that "materialized table" in a subsequent statement and need to see the changes reflected. As soon as my coworker gets that example built up I'll send a followup with it attached. Thank you, Adam Brusselback
Re: Implementing Incremental View Maintenance
Hi Adam Brusselback, On Mon, 31 Dec 2018 11:20:11 -0500 Adam Brusselback wrote: > Hi all, just wanted to say I am very happy to see progress made on this, > my codebase has multiple "materialized tables" which are maintained with > statement triggers (transition tables) and custom functions. They are ugly > and a pain to maintain, but they work because I have no other > solution...for now at least. We are want to find sutable use cases of the IVM patch being discussed in this thread, and I remembered your post that said you used statement triggers and custom functions. We hope the patch will help you. The patch implements IVM of immediate, that is, eager approach. Materialized views are updated immediately when its base tables are modified. While the view is always up-to-date, there is a overhead on base table modification. We would appreciate it if you could tell us what your use cases of materialized view is and whether our implementation suits your needs or not. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
> * Aggregate support > > The current patch supports several built-in aggregates, that is, count, sum, > avg, min, and max. Other built-in aggregates or user-defined aggregates are > not supported. > > Aggregates in a materialized view definition is checked if this is supported > using OIDs of aggregate function. For this end, Gen_fmgrtab.pl is changed to > output aggregate function's OIDs to fmgroids.h > (by 0006-Change-Gen_fmgrtab.pl-to-output-aggregate-function-s.patch). > The logic for each aggregate function to update aggregate values in > materialized views is enbedded in a trigger function. > > There was another option in the past discussion. That is, we could add one > or more new attribute to pg_aggregate which provides information about if > each aggregate function supports IVM and its logic[2]. If we have a mechanism > to support IVM in pg_aggregate, we may use more general aggregate functions > including user-defined aggregate in materialized views for IVM. > > For example, the current pg_aggregate has aggcombinefn attribute for > supporting partial aggregation. Maybe we could use combine functions to > calculate new aggregate values in materialized views when tuples are > inserted into a base table. However, in the context of IVM, we also need > other function used when tuples are deleted from a base table, so we can not > use partial aggregation for IVM in the current implementation. > > Maybe, we could support the deletion case by adding a new support function, > say, "inverse combine function". The "inverse combine function" would take > aggregate value in a materialized view and aggregate value calculated from a > delta of view, and produces the new aggregate value which equals the result > after tuples in a base table are deleted. > > However, we don't have concrete plan for the new design of pg_aggregate. > In addition, even if make a new support function in pg_aggregate for IVM, > we can't use this in the current IVM code because our code uses SQL via SPI > in order to update a materialized view and we can't call "internal" type > function directly in SQL. > > For these reasons, in the current patch, we decided to left supporting > general aggregates to the next version for simplicity, so the current patch > supports only some built-in aggregates and checks if they can be used in IVM > by their OIDs. Current patch for IVM is already large. I think implementing above will make the patch size even larger, which makes reviewer's work difficult. So I personally think we should commit the patch as it is, then enhance IVM to support user defined and other aggregates in later version of PostgreSQL. However, if supporting user defined and other aggregates is quite important for certain users, then we should rethink about this. It will be nice if we could know how high such demand is. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
Hi, I have reviewed the past discussions in this thread on IVM implementation of the proposed patch[1], and summarized it as following . We would appreciate any comments or suggestions on the patch as regard of them. * Aggregate support The current patch supports several built-in aggregates, that is, count, sum, avg, min, and max. Other built-in aggregates or user-defined aggregates are not supported. Aggregates in a materialized view definition is checked if this is supported using OIDs of aggregate function. For this end, Gen_fmgrtab.pl is changed to output aggregate function's OIDs to fmgroids.h (by 0006-Change-Gen_fmgrtab.pl-to-output-aggregate-function-s.patch). The logic for each aggregate function to update aggregate values in materialized views is enbedded in a trigger function. There was another option in the past discussion. That is, we could add one or more new attribute to pg_aggregate which provides information about if each aggregate function supports IVM and its logic[2]. If we have a mechanism to support IVM in pg_aggregate, we may use more general aggregate functions including user-defined aggregate in materialized views for IVM. For example, the current pg_aggregate has aggcombinefn attribute for supporting partial aggregation. Maybe we could use combine functions to calculate new aggregate values in materialized views when tuples are inserted into a base table. However, in the context of IVM, we also need other function used when tuples are deleted from a base table, so we can not use partial aggregation for IVM in the current implementation. Maybe, we could support the deletion case by adding a new support function, say, "inverse combine function". The "inverse combine function" would take aggregate value in a materialized view and aggregate value calculated from a delta of view, and produces the new aggregate value which equals the result after tuples in a base table are deleted. However, we don't have concrete plan for the new design of pg_aggregate. In addition, even if make a new support function in pg_aggregate for IVM, we can't use this in the current IVM code because our code uses SQL via SPI in order to update a materialized view and we can't call "internal" type function directly in SQL. For these reasons, in the current patch, we decided to left supporting general aggregates to the next version for simplicity, so the current patch supports only some built-in aggregates and checks if they can be used in IVM by their OIDs. * Hidden columns For supporting aggregates, DISTINCT, and EXISTS, the current implementation automatically create hidden columns whose name starts with "__ivm_" in materialized views. The columns starting with "__ivm_" are hidden, so when "SELECT * FROM ..." is issued to a materialized view, these are invisible for users. Users can not use such name as a user column in materialized views with IVM support. As for how to make internal columns invisible to SELECT *, previously there have been discussions about doing that using a new flag in pg_attribute[3]. However, the discussion is no longer active. So, we decided to use column name for checking if this is special or not in our IVM implementation for now. * TRUNCATE support Currently, TRUNCATE on base tables are not supported. When TRUNCATE command is executed on a base table, it is ignored and nothing occurs on materialized views. There are another options as followings: - Raise an error or warning when a base table is TRUNCATEed. - Make the view non-scannable (like REFRESH WITH NO DATA) - Update the view in any way. It would be easy for inner joins or aggregate views, but there is some difficult with outer joins. Which is the best way? Should we support TRUNCATE in the first version? Any suggestions would be greatly appreciated. [1] https://wiki.postgresql.org/wiki/Incremental_View_Maintenance [2] https://www.postgresql.org/message-id/20191129173328.e5a0e9f81e369a3769c4fd0c%40sraoss.co.jp [3] https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com Regard, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi, Attached is the rebased patch (v18) to add support for Incremental Materialized View Maintenance (IVM). It is able to be applied to current latest master branch. Also, this now supports simple CTEs (WITH clauses) which do not contain aggregates or DISTINCT like simple sub-queries. This feature is provided as an additional patch segment "0010-Add-CTE-support-in-IVM.patch". Example cte=# TABLE r; i | v ---+ 1 | 10 2 | 20 (2 rows) cte=# TABLE s; i | v ---+- 2 | 200 3 | 300 (2 rows) cte=# \d+ mv Materialized view "public.mv" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description +-+---+--+-+-+--+- r | integer | | | | plain | | x | integer | | | | plain | | View definition: WITH x AS ( SELECT s.i, s.v FROM s ) SELECT r.v AS r, x.v AS x FROM r, x WHERE r.i = x.i; Access method: heap Incremental view maintenance: yes cte=# SELECT * FROM mv; r | x +- 20 | 200 (1 row) cte=# INSERT INTO r VALUES (3,30); INSERT 0 1 cte=# INSERT INTO s VALUES (1,100); INSERT 0 1 cte=# SELECT * FROM mv; r | x +- 20 | 200 30 | 300 10 | 100 (3 rows) == Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v18.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Thu, 1 Oct 2020 13:43:49 +0900 Fujii Masao wrote: > When I glanced the doc patch (i.e., 0012), I found some typos. Thank you for your pointing out typos! I'll fix it. > > +CRATE INCREMENTAL MATERIALIZED VIEW, for example: > > Typo: CRATE should be CREATE ? > > +with __ivm_ and they contains information required > > Typo: contains should be contain ? > > +For exmaple, here are two materialized views based on the same view > > Typo: exmaple should be example ? > > +maintenance can be lager than REFRESH MATERIALIZED > VIEW > > Typo: lager should be larger ? > > +postgres=# SELECt * FROM m; -- automatically updated > > Typo: SELECt should be SELECT ? > > Regards, > > -- > Fujii Masao > Advanced Computing Technology Center > Research and Development Headquarters > NTT DATA CORPORATION -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On 2020/10/01 13:03, Tatsuo Ishii wrote: On Thu, Sep 17, 2020 at 09:42:45AM +0900, Tatsuo Ishii wrote: I am asking because these patch sets are now getting closer to committable state in my opinion, and if there's someting wrong, it should be fixed soon so that these patches are getting into the master branch. I think this feature has been long awaited by users and merging the patches should be a benefit for them. I don't have much thoughts to offer about that, but this patch is failing to apply, so a rebase is at least necessary. Yes. I think he is going to post a new patch (possibly with enhancements) soon. When I glanced the doc patch (i.e., 0012), I found some typos. +CRATE INCREMENTAL MATERIALIZED VIEW, for example: Typo: CRATE should be CREATE ? +with __ivm_ and they contains information required Typo: contains should be contain ? +For exmaple, here are two materialized views based on the same view Typo: exmaple should be example ? +maintenance can be lager than REFRESH MATERIALIZED VIEW Typo: lager should be larger ? +postgres=# SELECt * FROM m; -- automatically updated Typo: SELECt should be SELECT ? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Implementing Incremental View Maintenance
> On Thu, Sep 17, 2020 at 09:42:45AM +0900, Tatsuo Ishii wrote: >> I am asking because these patch sets are now getting closer to >> committable state in my opinion, and if there's someting wrong, it >> should be fixed soon so that these patches are getting into the master >> branch. >> >> I think this feature has been long awaited by users and merging the >> patches should be a benefit for them. > > I don't have much thoughts to offer about that, but this patch is > failing to apply, so a rebase is at least necessary. Yes. I think he is going to post a new patch (possibly with enhancements) soon. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
On Thu, Sep 17, 2020 at 09:42:45AM +0900, Tatsuo Ishii wrote: > I am asking because these patch sets are now getting closer to > committable state in my opinion, and if there's someting wrong, it > should be fixed soon so that these patches are getting into the master > branch. > > I think this feature has been long awaited by users and merging the > patches should be a benefit for them. I don't have much thoughts to offer about that, but this patch is failing to apply, so a rebase is at least necessary. -- Michael signature.asc Description: PGP signature
Re: Implementing Incremental View Maintenance
> I have nothing, I'm just reading starter papers and trying to learn a > bit more about the concepts at this stage. I was thinking of > reviewing some of the more mechanical parts of the patch set, though, > like perhaps the transition table lifetime management, since I have > worked on that area before. Do you have comments on this part? I am asking because these patch sets are now getting closer to committable state in my opinion, and if there's someting wrong, it should be fixed soon so that these patches are getting into the master branch. I think this feature has been long awaited by users and merging the patches should be a benefit for them. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
On Wed, 9 Sep 2020 14:22:28 +1200 Thomas Munro wrote: > > Therefore, usual update semantics (tuple locks and EvalPlanQual) and UPSERT > > can be used for optimization for some classes of view, but we don't have any > > other better idea than using table lock for views joining tables. We would > > appreciate it if you could suggest better solution. > > I have nothing, I'm just reading starter papers and trying to learn a > bit more about the concepts at this stage. I was thinking of > reviewing some of the more mechanical parts of the patch set, though, > like perhaps the transition table lifetime management, since I have > worked on that area before. Thank you for your interrest. It would be greatly appreciated if you could review the patch. > > > (Newer papers describe locking schemes that avoid even aggregate-row > > > level conflicts, by taking advantage of the associativity and > > > commutativity of aggregates like SUM and COUNT. You can allow N > > > writers to update the aggregate concurrently, and if any transaction > > > has to roll back it subtracts what it added, not necessarily restoring > > > the original value, so that nobody conflicts with anyone else, or > > > something like that... Contemplating an MVCC, no-rollbacks version of > > > that sort of thing leads to ideas like, I dunno, update chains > > > containing differential update trees to be compacted later... egad!) > > > > I am interested in papers you mentioned! Are they literatures in context of > > incremental view maintenance? > > Yeah. I was skim-reading some parts of [1] including section 2.5.1 > "Concurrency Control", which opens with some comments about > aggregates, locking and pointers to "V-locking" [2] for high > concurrency aggregates. There is also a pointer to G. Graefe and M. > J. Zwilling, "Transaction support for indexed views," which I haven't > located; apparently indexed views are Graefe's name for MVs, and > apparently this paper has a section on MVCC systems which sounds > interesting for us. > > [1] https://dsf.berkeley.edu/cs286/papers/mv-fntdb2012.pdf > [2] http://pages.cs.wisc.edu/~gangluo/latch.pdf Thanks for your information! I will also check references regarding with IVM and concurrency control. Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
On Wed, Sep 9, 2020 at 12:29 PM Yugo NAGATA wrote: > I also thought it might be resolved using tuple locks and EvalPlanQual > instead of table level lock, but there is still a unavoidable case. For > example, suppose that tuple dR is inserted into R in T1, and dS is inserted > into S in T2. Also, suppose that dR and dS will be joined in according to > the view definition. In this situation, without any lock, the change of V is > computed as dV=dR*S in T1, dV=R*dS in T2, respectively, and dR*dS would not > be included in the results. This causes inconsistency. I don't think this > could be resolved even if we use tuple locks. I see. Thanks for the explanation! > As to aggregate view without join , however, we might be able to use a lock > of more low granularity as you said, because if rows belonging a group in a > table is changes, we just update (or delete) corresponding rows in the view. > Even if there are concurrent transactions updating the same table, we would > be able to make one of them wait using tuple lock. If concurrent transactions > are trying to insert a tuple into the same table, we might need to use unique > index and UPSERT to avoid to insert multiple rows with same group key into > the view. > > Therefore, usual update semantics (tuple locks and EvalPlanQual) and UPSERT > can be used for optimization for some classes of view, but we don't have any > other better idea than using table lock for views joining tables. We would > appreciate it if you could suggest better solution. I have nothing, I'm just reading starter papers and trying to learn a bit more about the concepts at this stage. I was thinking of reviewing some of the more mechanical parts of the patch set, though, like perhaps the transition table lifetime management, since I have worked on that area before. > > (Newer papers describe locking schemes that avoid even aggregate-row > > level conflicts, by taking advantage of the associativity and > > commutativity of aggregates like SUM and COUNT. You can allow N > > writers to update the aggregate concurrently, and if any transaction > > has to roll back it subtracts what it added, not necessarily restoring > > the original value, so that nobody conflicts with anyone else, or > > something like that... Contemplating an MVCC, no-rollbacks version of > > that sort of thing leads to ideas like, I dunno, update chains > > containing differential update trees to be compacted later... egad!) > > I am interested in papers you mentioned! Are they literatures in context of > incremental view maintenance? Yeah. I was skim-reading some parts of [1] including section 2.5.1 "Concurrency Control", which opens with some comments about aggregates, locking and pointers to "V-locking" [2] for high concurrency aggregates. There is also a pointer to G. Graefe and M. J. Zwilling, "Transaction support for indexed views," which I haven't located; apparently indexed views are Graefe's name for MVs, and apparently this paper has a section on MVCC systems which sounds interesting for us. [1] https://dsf.berkeley.edu/cs286/papers/mv-fntdb2012.pdf [2] http://pages.cs.wisc.edu/~gangluo/latch.pdf
Re: Implementing Incremental View Maintenance
Hi Thomas, Thank you for your comment! On Sat, 5 Sep 2020 17:56:18 +1200 Thomas Munro wrote: > + /* > +* Wait for concurrent transactions which update this materialized view at > +* READ COMMITED. This is needed to see changes committed in other > +* transactions. No wait and raise an error at REPEATABLE READ or > +* SERIALIZABLE to prevent update anomalies of matviews. > +* XXX: dead-lock is possible here. > +*/ > + if (!IsolationUsesXactSnapshot()) > + LockRelationOid(matviewOid, ExclusiveLock); > + else if (!ConditionalLockRelationOid(matviewOid, ExclusiveLock)) > > Could you please say a bit more about your plans for concurrency control? > > Simple hand-crafted "rollup" triggers typically conflict only when > modifying the same output rows due to update/insert conflicts, or > perhaps some explicit row level locking if they're doing something > complex (unfortunately, they also very often have concurrency > bugs...). In some initial reading about MV maintenance I did today in > the hope of understanding some more context for this very impressive > but rather intimidating patch set, I gained the impression that > aggregate-row locking granularity is assumed as a baseline for eager > incremental aggregate maintenance. I understand that our > MVCC/snapshot scheme introduces extra problems, but I'm wondering if > these problems can be solved using the usual update semantics (the > EvalPlanQual mechanism), and perhaps also some UPSERT logic. Why is > it not sufficient to have locked all the base table rows that you have > modified, captured the before-and-after values generated by those > updates, and also locked all the IMV aggregate rows you will read, and > in the process acquired a view of the latest committed state of the > IMV aggregate rows you will modify (possibly having waited first)? In > other words, what other data do you look at, while computing the > incremental update, that might suffer from anomalies because of > snapshots and concurrency? For one thing, I am aware that unique > indexes for groups would probably be necessary; perhaps some subtle > problems of the sort usually solved with predicate locks lurk there? I decided to lock a matview considering views joining tables. For example, let V = R*S is an incrementally maintainable materialized view which joins tables R and S. Suppose there are two concurrent transactions T1 which changes table R to R' and T2 which changes S to S'. Without any lock, in READ COMMITTED mode, V would be updated to represent V=R'*S in T1, and V=R*S' in T2, so it would cause inconsistency. By locking the view V, transactions T1, T2 are processed serially and this inconsistency can be avoided. I also thought it might be resolved using tuple locks and EvalPlanQual instead of table level lock, but there is still a unavoidable case. For example, suppose that tuple dR is inserted into R in T1, and dS is inserted into S in T2. Also, suppose that dR and dS will be joined in according to the view definition. In this situation, without any lock, the change of V is computed as dV=dR*S in T1, dV=R*dS in T2, respectively, and dR*dS would not be included in the results. This causes inconsistency. I don't think this could be resolved even if we use tuple locks. As to aggregate view without join , however, we might be able to use a lock of more low granularity as you said, because if rows belonging a group in a table is changes, we just update (or delete) corresponding rows in the view. Even if there are concurrent transactions updating the same table, we would be able to make one of them wait using tuple lock. If concurrent transactions are trying to insert a tuple into the same table, we might need to use unique index and UPSERT to avoid to insert multiple rows with same group key into the view. Therefore, usual update semantics (tuple locks and EvalPlanQual) and UPSERT can be used for optimization for some classes of view, but we don't have any other better idea than using table lock for views joining tables. We would appreciate it if you could suggest better solution. > (Newer papers describe locking schemes that avoid even aggregate-row > level conflicts, by taking advantage of the associativity and > commutativity of aggregates like SUM and COUNT. You can allow N > writers to update the aggregate concurrently, and if any transaction > has to roll back it subtracts what it added, not necessarily restoring > the original value, so that nobody conflicts with anyone else, or > something like that... Contemplating an MVCC, no-rollbacks version of > that sort of thing leads to ideas like, I dunno, update chains > containing differential update trees to be compacted later... egad!) I am interested in papers you mentioned! Are they literatures in context of incremental view maintenance? Regards, Yugo Nagata -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
Hi Nagata-san, On Mon, Aug 31, 2020 at 5:32 PM Yugo NAGATA wrote: > https://wiki.postgresql.org/wiki/Incremental_View_Maintenance Thanks for writing this! + /* +* Wait for concurrent transactions which update this materialized view at +* READ COMMITED. This is needed to see changes committed in other +* transactions. No wait and raise an error at REPEATABLE READ or +* SERIALIZABLE to prevent update anomalies of matviews. +* XXX: dead-lock is possible here. +*/ + if (!IsolationUsesXactSnapshot()) + LockRelationOid(matviewOid, ExclusiveLock); + else if (!ConditionalLockRelationOid(matviewOid, ExclusiveLock)) Could you please say a bit more about your plans for concurrency control? Simple hand-crafted "rollup" triggers typically conflict only when modifying the same output rows due to update/insert conflicts, or perhaps some explicit row level locking if they're doing something complex (unfortunately, they also very often have concurrency bugs...). In some initial reading about MV maintenance I did today in the hope of understanding some more context for this very impressive but rather intimidating patch set, I gained the impression that aggregate-row locking granularity is assumed as a baseline for eager incremental aggregate maintenance. I understand that our MVCC/snapshot scheme introduces extra problems, but I'm wondering if these problems can be solved using the usual update semantics (the EvalPlanQual mechanism), and perhaps also some UPSERT logic. Why is it not sufficient to have locked all the base table rows that you have modified, captured the before-and-after values generated by those updates, and also locked all the IMV aggregate rows you will read, and in the process acquired a view of the latest committed state of the IMV aggregate rows you will modify (possibly having waited first)? In other words, what other data do you look at, while computing the incremental update, that might suffer from anomalies because of snapshots and concurrency? For one thing, I am aware that unique indexes for groups would probably be necessary; perhaps some subtle problems of the sort usually solved with predicate locks lurk there? (Newer papers describe locking schemes that avoid even aggregate-row level conflicts, by taking advantage of the associativity and commutativity of aggregates like SUM and COUNT. You can allow N writers to update the aggregate concurrently, and if any transaction has to roll back it subtracts what it added, not necessarily restoring the original value, so that nobody conflicts with anyone else, or something like that... Contemplating an MVCC, no-rollbacks version of that sort of thing leads to ideas like, I dunno, update chains containing differential update trees to be compacted later... egad!)
Re: Implementing Incremental View Maintenance
Hi, I updated the wiki page. https://wiki.postgresql.org/wiki/Incremental_View_Maintenance On Fri, 21 Aug 2020 21:40:50 +0900 (JST) Tatsuo Ishii wrote: > From: Yugo NAGATA > Subject: Re: Implementing Incremental View Maintenance > Date: Fri, 21 Aug 2020 17:23:20 +0900 > Message-ID: <20200821172320.a2506577d5244b6066f69...@sraoss.co.jp> > > > On Wed, 19 Aug 2020 10:02:42 +0900 (JST) > > Tatsuo Ishii wrote: > > > >> I have looked into this. > > > > Thank you for your reviewing! > > > >> - 0004-Allow-to-prolong-life-span-of-transition-tables-unti.patch: > >> This one needs a comment to describe what the function does etc. > >> > >> +void > >> +SetTransitionTablePreserved(Oid relid, CmdType cmdType) > >> +{ > > > > I added a comment for this function and related places. > > > > +/* > > + * SetTransitionTablePreserved > > + * > > + * Prolong lifespan of transition tables corresponding specified relid and > > + * command type to the end of the outmost query instead of each nested > > query. > > + * This enables to use nested AFTER trigger's transition tables from outer > > + * query's triggers. Currently, only immediate incremental view > > maintenance > > + * uses this. > > + */ > > +void > > +SetTransitionTablePreserved(Oid relid, CmdType cmdType) > > > > Also, I removed releted unnecessary code which was left accidentally. > > > > > >> - 0007-Add-aggregates-support-in-IVM.patch > >> "Check if the given aggregate function is supporting" shouldn't be > >> "Check if the given aggregate function is supporting IVM"? > > > > Yes, you are right. I fixed this, too. > > > >> > >> + * check_aggregate_supports_ivm > >> + * > >> + * Check if the given aggregate function is supporting > > Thanks for the fixes. I have changed the commit fest status to "Ready > for Committer". > > Best regards, > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese:http://www.sraoss.co.jp > > -- Yugo NAGATA
Re: Implementing Incremental View Maintenance
From: Yugo NAGATA Subject: Re: Implementing Incremental View Maintenance Date: Fri, 21 Aug 2020 17:23:20 +0900 Message-ID: <20200821172320.a2506577d5244b6066f69...@sraoss.co.jp> > On Wed, 19 Aug 2020 10:02:42 +0900 (JST) > Tatsuo Ishii wrote: > >> I have looked into this. > > Thank you for your reviewing! > >> - 0004-Allow-to-prolong-life-span-of-transition-tables-unti.patch: >> This one needs a comment to describe what the function does etc. >> >> +void >> +SetTransitionTablePreserved(Oid relid, CmdType cmdType) >> +{ > > I added a comment for this function and related places. > > +/* > + * SetTransitionTablePreserved > + * > + * Prolong lifespan of transition tables corresponding specified relid and > + * command type to the end of the outmost query instead of each nested query. > + * This enables to use nested AFTER trigger's transition tables from outer > + * query's triggers. Currently, only immediate incremental view maintenance > + * uses this. > + */ > +void > +SetTransitionTablePreserved(Oid relid, CmdType cmdType) > > Also, I removed releted unnecessary code which was left accidentally. > > >> - 0007-Add-aggregates-support-in-IVM.patch >> "Check if the given aggregate function is supporting" shouldn't be >> "Check if the given aggregate function is supporting IVM"? > > Yes, you are right. I fixed this, too. > >> >> + * check_aggregate_supports_ivm >> + * >> + * Check if the given aggregate function is supporting Thanks for the fixes. I have changed the commit fest status to "Ready for Committer". Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
On Wed, 19 Aug 2020 10:02:42 +0900 (JST) Tatsuo Ishii wrote: > I have looked into this. Thank you for your reviewing! > - 0004-Allow-to-prolong-life-span-of-transition-tables-unti.patch: > This one needs a comment to describe what the function does etc. > > +void > +SetTransitionTablePreserved(Oid relid, CmdType cmdType) > +{ I added a comment for this function and related places. +/* + * SetTransitionTablePreserved + * + * Prolong lifespan of transition tables corresponding specified relid and + * command type to the end of the outmost query instead of each nested query. + * This enables to use nested AFTER trigger's transition tables from outer + * query's triggers. Currently, only immediate incremental view maintenance + * uses this. + */ +void +SetTransitionTablePreserved(Oid relid, CmdType cmdType) Also, I removed releted unnecessary code which was left accidentally. > - 0007-Add-aggregates-support-in-IVM.patch > "Check if the given aggregate function is supporting" shouldn't be > "Check if the given aggregate function is supporting IVM"? Yes, you are right. I fixed this, too. > > + * check_aggregate_supports_ivm > + * > + * Check if the given aggregate function is supporting Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v17.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
I have looked into this. > Hi, > > Attached is the rebased patch (v16) to add support for Incremental > Materialized View Maintenance (IVM). It is able to be applied to > current latest master branch. > > This also includes the following small fixes: > > - Add a query check for expressions containing aggregates in it > - [doc] Add description about which situations IVM is effective or not in > - Improve hint in log messages > - Reorganize include directives in codes - make check passed. - make check-world passed. - 0004-Allow-to-prolong-life-span-of-transition-tables-unti.patch: This one needs a comment to describe what the function does etc. +void +SetTransitionTablePreserved(Oid relid, CmdType cmdType) +{ - 0007-Add-aggregates-support-in-IVM.patch "Check if the given aggregate function is supporting" shouldn't be "Check if the given aggregate function is supporting IVM"? + * check_aggregate_supports_ivm + * + * Check if the given aggregate function is supporting Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
Hi, Attached is the rebased patch (v16) to add support for Incremental Materialized View Maintenance (IVM). It is able to be applied to current latest master branch. This also includes the following small fixes: - Add a query check for expressions containing aggregates in it - [doc] Add description about which situations IVM is effective or not in - Improve hint in log messages - Reorganize include directives in codes Regards, Yugo Nagata -- Yugo NAGATA IVM_patches_v16.tar.gz Description: application/gzip
Re: Implementing Incremental View Maintenance
On Tue, Jul 7, 2020 at 3:26 PM Tatsuo Ishii wrote: > >> Query checks for following restrictions are added: > > > > > > Are all known supported cases listed below? > > They are "restrictions" and are not supported. > Yes, I missed the "not" word:( > > >> - inheritance parent table > >> ... > >> - targetlist containing IVM column > >> - simple subquery is only supported > >> > > > > How to understand 3 items above? > > The best way to understand them is looking into regression test. > Thanks for sharing, I will look into it. -- Best Regards Andy Fan
Re: Implementing Incremental View Maintenance
>> Query checks for following restrictions are added: > > > Are all known supported cases listed below? They are "restrictions" and are not supported. > >> - inheritance parent table >> ... >> - targetlist containing IVM column >> - simple subquery is only supported >> > > How to understand 3 items above? The best way to understand them is looking into regression test. src/test/regress/expected/incremental_matview.out. >> - inheritance parent table -- inheritance parent is not supported with IVM" BEGIN; CREATE TABLE parent (i int, v int); CREATE TABLE child_a(options text) INHERITS(parent); CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm21 AS SELECT * FROM parent; ERROR: inheritance parent is not supported on incrementally maintainable materialized view >> - targetlist containing IVM column -- tartget list cannot contain ivm clumn that start with '__ivm' CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm28 AS SELECT i AS "__ivm_count__" FROM mv_base_a; ERROR: column name __ivm_count__ is not supported on incrementally maintainable materialized view >> - simple subquery is only supported -- subquery is not supported with outer join CREATE INCREMENTAL MATERIALIZED VIEW mv(a,b) AS SELECT a.i, b.i FROM mv_base_a a LEFT JOIN (SELECT * FROM mv_base_b) b ON a.i=b.i; ERROR: this query is not allowed on incrementally maintainable materialized view HINT: subquery is not supported with outer join Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
Thanks for the patch! > Query checks for following restrictions are added: Are all known supported cases listed below? > - inheritance parent table > ... > - targetlist containing IVM column > - simple subquery is only supported > How to understand 3 items above? - Best Regards Andy Fan
Re: Implementing Incremental View Maintenance
>> +1, This is a smart idea. How did you test it? AFAIK, we can test it > with: > > 1. For any query like SELECT xxx, we create view like CREATE MATERIAL VIEW > mv_name as SELECT xxx; to test if the features in the query are supported. No I didn't test the correctness of IVM with TPC-DS data for now. TPC-DS comes with a data generator and we can test IVM something like: SELECT * FROM IVM_vew EXCEPT SELECT ... (TPC-DS original query); If this produces 0 row, then the IVM is correct for the initial data. (of course actually we need to add appropreate ORDER BY and LIMIT clause to the SELECT statement for IVM if neccessary). > 2. Update the data and then compare the result with SELECT XXX with SELECT > * from mv_name to test if the data is correctly sync. I wanted to test the data updating but I am still struggling how to extract correct updating data from TPC-DS data set. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Implementing Incremental View Maintenance
On Fri, May 8, 2020 at 9:13 AM Tatsuo Ishii wrote: > >> Hi, > >> > >> Attached is the latest patch (v15) to add support for Incremental > Materialized > >> View Maintenance (IVM). It is possible to apply to current latest > master branch. > > I have tried to use IVM against TPC-DS (http://www.tpc.org/tpcds/) > queries. TPC-DS models decision support systems and those queries are > modestly complex. So I thought applying IVM to those queries could > show how IVM covers real world queries. > > +1, This is a smart idea. How did you test it? AFAIK, we can test it with: 1. For any query like SELECT xxx, we create view like CREATE MATERIAL VIEW mv_name as SELECT xxx; to test if the features in the query are supported. 2. Update the data and then compare the result with SELECT XXX with SELECT * from mv_name to test if the data is correctly sync. Best Regards Andy Fan
Re: Implementing Incremental View Maintenance
>> Hi, >> >> Attached is the latest patch (v15) to add support for Incremental >> Materialized >> View Maintenance (IVM). It is possible to apply to current latest master >> branch. I have tried to use IVM against TPC-DS (http://www.tpc.org/tpcds/) queries. TPC-DS models decision support systems and those queries are modestly complex. So I thought applying IVM to those queries could show how IVM covers real world queries. Since IVM does not support queries including ORDER BY and LIMIT, I removed them from the queries before the test. Here are some facts so far learned in this attempt. - Number of TPC-DS query files is 99. - IVM was successfully applied to 20 queries. - 33 queries failed because they use WITH clause (CTE) (currenly IVM does not support CTE). - Error messages from failed queries (except those using WITH) are below: (the number indicates how many queries failed by the same reason) 11 aggregate functions in nested query are not supported on incrementally maintainable materialized view 8window functions are not supported on incrementally maintainable materialized view 7UNION/INTERSECT/EXCEPT statements are not supported on incrementally maintainable materialized view 5WHERE clause only support subquery with EXISTS clause 3GROUPING SETS, ROLLUP, or CUBE clauses is not supported on incrementally maintainable materialized view 3aggregate function and EXISTS condition are not supported at the same time 2GROUP BY expression not appeared in select list is not supported on incrementally maintainable materialized view 2aggregate function with DISTINCT arguments is not supported on incrementally maintainable materialized view 2aggregate is not supported with outer join 1aggregate function stddev_samp(integer) is not supported on incrementally maintainable materialized view 1HAVING clause is not supported on incrementally maintainable materialized view 1subquery is not supported with outer join 1 column "avg" specified more than once Attached are the queries IVM are successfully applied. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp IVM_sucessfull_queries.tar.gz Description: Binary data