Re: documentation on HOT
On Fri, Jul 22, 2022 at 03:33:20PM -0700, Peter Geoghegan wrote: > On Fri, Jul 22, 2022 at 2:11 PM Bruce Momjian wrote: > > I have improved the wording of the last paragraph in this patch. > > I think that it would be worth prominently explaining where heap-only > tuples get their name from: it comes from the fact there are (by > definition) no entries for a heap-only tuple in any index, ever. > Indexes are nevertheless capable of locating heap-only tuples during > index scans, by dealing with a little additional indirection: they > must traverse groups of related tuple versions, all for the same > logical row that was HOT updated one or more times -- this group of > related tuples is called a HOT chain. > > This seems like a useful thing to emphasize because it places the > emphasis on what *doesn't* happen. Mostly what doesn't happen in > indexes. > > New item identifiers actually *are* needed for heap-only tuples > (perhaps we could get away with it, but we don't). However, that > doesn't really matter too much in practice. Heap-only tuples can still > have their line pointers set to LP_UNUSED directly during pruning, > without having to be set to LP_DEAD for a time first (a situation > which VACUUM alone can correct by setting the LP_DEAD items to > LP_UNUSED during its second heap pass). > > So heap-only tuples "skip the step" where they have to become LP_DEAD > stubs/tombstones. Which is possible precisely because indexes don't > need to be considered (they're "heap-only"). Good points. I have updated the attached patch and URL to mention that HOT rows are _completely_ removed, and why that is possible, and I clarified the page item identifier mention. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson diff --git a/doc/src/sgml/acronyms.sgml b/doc/src/sgml/acronyms.sgml index 9ed148ab84..2df6559acc 100644 --- a/doc/src/sgml/acronyms.sgml +++ b/doc/src/sgml/acronyms.sgml @@ -299,9 +299,7 @@ HOT - https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD";>Heap-Only - Tuples + Heap-Only Tuples diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml index a9200ee52e..6f608a14bf 100644 --- a/doc/src/sgml/btree.sgml +++ b/doc/src/sgml/btree.sgml @@ -639,7 +639,8 @@ options(relopts local_relopts *) returns accumulate and adversely affect query latency and throughput. This typically occurs with UPDATE-heavy workloads where most individual updates cannot apply the - HOT optimization. Changing the value of only + HOT optimization. + Changing the value of only one column covered by one index during an UPDATE always necessitates a new set of index tuples — one for each and every index on the diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index a186e35f00..248dbc0e26 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -4381,7 +4381,7 @@ SCRAM-SHA-256$:&l If true, queries must not use the index until the xmin of this pg_index row is below their TransactionXmin - event horizon, because the table may contain broken HOT chains with + event horizon, because the table may contain broken HOT chains with incompatible rows that they can see diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index e2d728e0c4..e5a84ed76d 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4482,7 +4482,8 @@ ANY num_sync ( HOT updates +will defer cleanup of dead row versions. The default is zero transactions, meaning that dead row versions can be removed as soon as possible, that is, as soon as they are no longer visible to any open transaction. You may wish to set this to a diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml index cf359fa9ff..4f83970c85 100644 --- a/doc/src/sgml/indexam.sgml +++ b/doc/src/sgml/indexam.sgml @@ -45,7 +45,8 @@ extant versions of the same logical row; to an index, each tuple is an independent object that needs its own index entry. Thus, an update of a row always creates all-new index entries for the row, even if - the key values did not change. (HOT tuples are an exception to this + the key values did not change. (HOT + tuples are an exception to this statement; but indexes do not deal with those, either.) Index entries for dead tuples are reclaimed (by vacuuming) when the dead tuples themselves are reclaimed. diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index 023157d888..42e1e86c8a 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -749,7 +749,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name)); Index expressions are relati
Re: CREATE INDEX USING documentation
On Sat, Jan 15, 2022 at 11:01:58PM +, PG Doc comments form wrote: > The following documentation comment has been logged on the website: > > Page: https://www.postgresql.org/docs/14/sql-createindex.html > Description: > > In the CREATE INDEX > synopsis(https://www.postgresql.org/docs/current/sql-createindex.html), the > USING method parameter is listed, however in the PARAMETERS section USING is > never fully described, nor is there a link to the very help 11.2 Index Types > page (https://www.postgresql.org/docs/14/indexes-types.html). Well, that is certainly odd. The attached patch adds a description of the option, and a link to the chapter you suggested. > On a related note, the CREATE INDEX page provides the following blurb: > > "Prior releases of PostgreSQL also had an R-tree index method. This method > has been removed because it had no significant advantages over the GiST > method. If USING rtree is specified, CREATE INDEX will interpret it as USING > gist, to simplify conversion of old databases to GiST." > > But there is no such blurb on the 11.2 Index Types page. We think of backward compatibility stuff as operational, meaning it appears in the command manual page but not necessarily in the main docs. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index a5bac9f737..944fdb4b09 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -149,6 +149,18 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] + + USING + + +The optional USING clause specifies an index +type as described in . If not +specified, a default index type will be used based on the +data types of the columns. + + + + INCLUDE
Re: documentation on HOT
On Sat, Jul 23, 2022 at 8:51 AM Bruce Momjian wrote: > Good points. I have updated the attached patch and URL to mention that > HOT rows are _completely_ removed, and why that is possible, and I > clarified the page item identifier mention. I think that this version looks very good, but I do have some minor notes: * You wrote "Specifically, updates cause additional rows to be added to tables." Perhaps this could be rephrased: "Specifically, updates add new physical tuples to tables to represent each new version." I think that the term "row" should only refer to the simple/abstract idea of a row from a table, while the term tuple should be preferred when referring to a physical embodiment of a row, like one version of a row. Perhaps it's worth following that convention across the board here (not just in this sentence that I have highlighted). * You wrote "This can also require new index entries for each updated row, and removal of old versions of rows can be expensive" I believe that the operative word in this sentence (which appears in the first paragraph) is "can". I think that it would be good to go just a bit further with that. Maybe add another sentence immediately afterwards that conveys "and now we're going to discuss when and how new versions from updates can sometimes avoid the need for a new round of index entries". * You wrote "New index entries are not needed to represent updated rows" It seems to me that this undersells the key benefit. You could perhaps add another sentence. Something like: "This avoids the immediate cost of adding new successor versions to each and every index, and avoids the cost of removing the obsolete versions from each and every index later on." * You refer to opportunistic pruning as something that happens "during normal operation", but that doesn't seem to get the idea of "opportunistic" across. It seems like it would be worth writing a sentence or two more on this, just to get that aspect across. Opportunistic cleanup occurs when a query happens to notice that a heap page that it had to read as part of query processing needed to be cleaned up in passing. We do it there and then because it happens to be relatively cheap and convenient to do it that way. That sort of thing. Overall, I think that this is suitable to commit, and I don't want to make too much of a fuss. It's great that we're doing this. Thanks -- Peter Geoghegan
Re: documentation on HOT
On Sat, Jul 23, 2022 at 11:34 AM Peter Geoghegan wrote: > On Sat, Jul 23, 2022 at 8:51 AM Bruce Momjian wrote: > > Good points. I have updated the attached patch and URL to mention that > > HOT rows are _completely_ removed, and why that is possible, and I > > clarified the page item identifier mention. > > I think that this version looks very good, but I do have some minor notes: > > * You wrote "Specifically, updates cause additional rows to be added to > tables." > > Perhaps this could be rephrased: "Specifically, updates add new > physical tuples to tables to represent each new version." > > I think that the term "row" should only refer to the simple/abstract > idea of a row from a table, while the term tuple should be preferred > when referring to a physical embodiment of a row, like one version of > a row. Perhaps it's worth following that convention across the board > here (not just in this sentence that I have highlighted). > I concur, suggesting the following: "Specifically, updates result in multiple rows versions (tuples) existing on the table." "There is sufficient free space on the page containing the old tuple for the updated tuple." "Old tuples can be completely removed..." > Overall, I think that this is suitable to commit, and I don't want to > make too much of a fuss. It's great that we're doing this. > > Agreed. The other suggestion listed are not clear-cut winners in my mind. The following, though, seems to just come out of nowhere. It would be better setup as a "(See for why this is possible.)" instead of dropping "page item identifiers" on the reader. + This removal is possible because indexes + do not reference their page + item identifiers. As a related thought, this has done a great job of being usable for a DBA operating at a high-level of system knowledge and interaction. I don't think burying it in storage.sgml is desirable, Maybe "Performance Tips" under "Avoid Unnecessary Indexes" (yes, a bit of a stretch, but nothing else seems to fit better, except maybe in concurrency control since we are discussing overcoming the limitation of our concurrency control choice. Summary paragraph: "can only happen if" => "can only be created if" David J.
Re: gettext instructions for Windows seems wrong
> On 22 Jul 2022, at 16:32, Bruce Momjian wrote: > > On Wed, Jan 19, 2022 at 10:24:57AM +0100, Daniel Gustafsson wrote: >>> On 17 Jan 2022, at 23:55, PG Doc comments form >>> wrote: >> >>> The instructions say that I have to edit config.pl (actually, I had to >>> create it).. >> >> I know next to nothing about NLS on Windows, but regardless of doc updates >> for >> this I think we should clarify that config.pl may not exist, like how we do >> for >> buildenv.pl. > > Good point --- I have developed the attached patch to implement this. Thanks, LGTM -- Daniel Gustafsson https://vmware.com/
pg document is not recorded:maintenance_work_mem can not large than 2GB in windows
pg document is not recorded:maintenance_work_mem can not large than 2GB in windows。 https://www.postgresql.org/docs/14/runtime-config-resource.html I suggest that:modify maintenance_work_mem parameter description in the pg documents: add maintenance_work_mem parameter size limit in windows platform if you set maintenance_work_mem large that 2GB in windows platform postgresql service can not start。 in the windows event viewer ,thare is the following error: "ERROR: 3072000 is outside the valid range for parameter "maintenance_work_mem" (64 .. 2097151) and the work_mem parameter also has similar problems as above。 https://www.postgresql.org/message-id/17895.1315869...@sss.pgh.pa.us so, It is best to add this description to the document about work_mem and maintenance_work_mem parameter