date:20220723

Re: documentation on HOT

2022-07-23 Thread Bruce Momjian

On Fri, Jul 22, 2022 at 03:33:20PM -0700, Peter Geoghegan wrote:
> On Fri, Jul 22, 2022 at 2:11 PM Bruce Momjian  wrote:
> > I have improved the wording of the last paragraph in this patch.
> 
> I think that it would be worth prominently explaining where heap-only
> tuples get their name from: it comes from the fact there are (by
> definition) no entries for a heap-only tuple in any index, ever.
> Indexes are nevertheless capable of locating heap-only tuples during
> index scans, by dealing with a little additional indirection: they
> must traverse groups of related tuple versions, all for the same
> logical row that was HOT updated one or more times -- this group of
> related tuples is called a HOT chain.
> 
> This seems like a useful thing to emphasize because it places the
> emphasis on what *doesn't* happen. Mostly what doesn't happen in
> indexes.
> 
> New item identifiers actually *are* needed for heap-only tuples
> (perhaps we could get away with it, but we don't). However, that
> doesn't really matter too much in practice. Heap-only tuples can still
> have their line pointers set to LP_UNUSED directly during pruning,
> without having to be set to LP_DEAD for a time first (a situation
> which VACUUM alone can correct by setting the LP_DEAD items to
> LP_UNUSED during its second heap pass).
> 
> So heap-only tuples "skip the step" where they have to become LP_DEAD
> stubs/tombstones. Which is possible precisely because indexes don't
> need to be considered (they're "heap-only").

Good points.  I have updated the attached patch and URL to mention that
HOT rows are _completely_ removed, and why that is possible, and I
clarified the page item identifier mention.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson

diff --git a/doc/src/sgml/acronyms.sgml b/doc/src/sgml/acronyms.sgml
index 9ed148ab84..2df6559acc 100644
--- a/doc/src/sgml/acronyms.sgml
+++ b/doc/src/sgml/acronyms.sgml
@@ -299,9 +299,7 @@
 HOT
 
  
-  https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD";>Heap-Only
-  Tuples
+  Heap-Only Tuples
  
 

diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index a9200ee52e..6f608a14bf 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -639,7 +639,8 @@ options(relopts local_relopts *) returns
accumulate and adversely affect query latency and throughput.  This
typically occurs with UPDATE-heavy workloads
where most individual updates cannot apply the
-   HOT optimization.  Changing the value of only
+   HOT optimization.
+   Changing the value of only
one column covered by one index during an UPDATE
always necessitates a new set of index tuples
— one for each and every index on the
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a186e35f00..248dbc0e26 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -4381,7 +4381,7 @@ SCRAM-SHA-256$:&l
   
If true, queries must not use the index until the xmin
of this pg_index row is below their TransactionXmin
-   event horizon, because the table may contain broken HOT chains with
+   event horizon, because the table may contain broken HOT chains with
incompatible rows that they can see
   
  
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e2d728e0c4..e5a84ed76d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4482,7 +4482,8 @@ ANY num_sync ( HOT updates
+will defer cleanup of dead row versions. The
 default is zero transactions, meaning that dead row versions can be
 removed as soon as possible, that is, as soon as they are no longer
 visible to any open transaction.  You may wish to set this to a
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index cf359fa9ff..4f83970c85 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -45,7 +45,8 @@
extant versions of the same logical row; to an index, each tuple is
an independent object that needs its own index entry.  Thus, an
update of a row always creates all-new index entries for the row, even if
-   the key values did not change.  (HOT tuples are an exception to this
+   the key values did not change.  (HOT
+   tuples are an exception to this
statement; but indexes do not deal with those, either.)  Index entries for
dead tuples are reclaimed (by vacuuming) when the dead tuples themselves
are reclaimed.
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 023157d888..42e1e86c8a 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -749,7 +749,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
   
Index expressions are relati

Re: CREATE INDEX USING documentation

2022-07-23 Thread Bruce Momjian

On Sat, Jan 15, 2022 at 11:01:58PM +, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/14/sql-createindex.html
> Description:
> 
> In the CREATE INDEX
> synopsis(https://www.postgresql.org/docs/current/sql-createindex.html), the
> USING method parameter is listed, however in the PARAMETERS section USING is
> never fully described, nor is there a link to the very help 11.2 Index Types
> page (https://www.postgresql.org/docs/14/indexes-types.html).  

Well, that is certainly odd.  The attached patch adds a description of
the option, and a link to the chapter you suggested.

> On a related note, the CREATE INDEX page provides the following blurb: 
> 
> "Prior releases of PostgreSQL also had an R-tree index method. This method
> has been removed because it had no significant advantages over the GiST
> method. If USING rtree is specified, CREATE INDEX will interpret it as USING
> gist, to simplify conversion of old databases to GiST."
> 
> But there is no such blurb on the 11.2 Index Types page. 

We think of backward compatibility stuff as operational, meaning it
appears in the command manual page but not necessarily in the main docs.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5bac9f737..944fdb4b09 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -149,6 +149,18 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
  
 
+ 
+  USING
+  
+   
+The optional USING clause specifies an index
+type as described in .  If not
+specified, a default index type will be used based on the
+data types of the columns.
+   
+  
+ 
+
  
   INCLUDE

Re: documentation on HOT

2022-07-23 Thread Peter Geoghegan

On Sat, Jul 23, 2022 at 8:51 AM Bruce Momjian  wrote:
> Good points.  I have updated the attached patch and URL to mention that
> HOT rows are _completely_ removed, and why that is possible, and I
> clarified the page item identifier mention.

I think that this version looks very good, but I do have some minor notes:

* You wrote "Specifically, updates cause additional rows to be added to tables."

Perhaps this could be rephrased: "Specifically, updates add new
physical tuples to tables to represent each new version."

I think that the term "row" should only refer to the simple/abstract
idea of a row from a table, while the term tuple should be preferred
when referring to a physical embodiment of a row, like one version of
a row. Perhaps it's worth following that convention across the board
here (not just in this sentence that I have highlighted).

* You wrote "This can also require new index entries for each updated
row, and removal of old versions of rows can be expensive"

I believe that the operative word in this sentence (which appears in
the first paragraph) is "can". I think that it would be good to go
just a bit further with that. Maybe add another sentence immediately
afterwards that conveys "and now we're going to discuss when and how
new versions from updates can sometimes avoid the need for a new round
of index entries".

* You wrote "New index entries are not needed to represent updated rows"

It seems to me that this undersells the key benefit. You could perhaps
add another sentence. Something like: "This avoids the immediate cost
of adding new successor versions to each and every index, and avoids
the cost of removing the obsolete versions from each and every index
later on."

* You refer to opportunistic pruning as something that happens "during
normal operation", but that doesn't seem to get the idea of
"opportunistic" across.

It seems like it would be worth writing a sentence or two more on
this, just to get that aspect across. Opportunistic cleanup occurs
when a query happens to notice that a heap page that it had to read as
part of query processing needed to be cleaned up in passing. We do it
there and then because it happens to be relatively cheap and
convenient to do it that way. That sort of thing.

Overall, I think that this is suitable to commit, and I don't want to
make too much of a fuss. It's great that we're doing this.

Thanks
-- 
Peter Geoghegan

Re: documentation on HOT

2022-07-23 Thread David G. Johnston

On Sat, Jul 23, 2022 at 11:34 AM Peter Geoghegan  wrote:

> On Sat, Jul 23, 2022 at 8:51 AM Bruce Momjian  wrote:
> > Good points.  I have updated the attached patch and URL to mention that
> > HOT rows are _completely_ removed, and why that is possible, and I
> > clarified the page item identifier mention.
>
> I think that this version looks very good, but I do have some minor notes:
>
> * You wrote "Specifically, updates cause additional rows to be added to
> tables."
>
> Perhaps this could be rephrased: "Specifically, updates add new
> physical tuples to tables to represent each new version."
>
> I think that the term "row" should only refer to the simple/abstract
> idea of a row from a table, while the term tuple should be preferred
> when referring to a physical embodiment of a row, like one version of
> a row. Perhaps it's worth following that convention across the board
> here (not just in this sentence that I have highlighted).
>

I concur, suggesting the following:

"Specifically, updates result in multiple rows versions (tuples) existing
on the table."

"There is sufficient free space on the page containing the old tuple for
the updated tuple."

"Old tuples can be completely removed..."

> Overall, I think that this is suitable to commit, and I don't want to
> make too much of a fuss. It's great that we're doing this.
>
>
Agreed.  The other suggestion listed are not clear-cut winners in my mind.

The following, though, seems to just come out of nowhere.  It would be
better setup as a "(See  for why this is possible.)" instead of
dropping "page item identifiers" on the reader.

+ This removal is possible because indexes
+ do not reference their page
+ item identifiers.

As a related thought, this has done a great job of being usable for a DBA
operating at a high-level of system knowledge and interaction.  I don't
think burying it in storage.sgml is desirable,  Maybe "Performance Tips"
under "Avoid Unnecessary Indexes" (yes, a bit of a stretch, but nothing
else seems to fit better, except maybe in concurrency control since we are
discussing overcoming the limitation of our concurrency control choice.

Summary paragraph:
"can only happen if" => "can only be created if"

David J.

Re: gettext instructions for Windows seems wrong

2022-07-23 Thread Daniel Gustafsson

> On 22 Jul 2022, at 16:32, Bruce Momjian  wrote:
> 
> On Wed, Jan 19, 2022 at 10:24:57AM +0100, Daniel Gustafsson wrote:
>>> On 17 Jan 2022, at 23:55, PG Doc comments form  
>>> wrote:
>> 
>>> The instructions say that I have to edit config.pl (actually, I had to
>>> create it)..
>> 
>> I know next to nothing about NLS on Windows, but regardless of doc updates 
>> for
>> this I think we should clarify that config.pl may not exist, like how we do 
>> for
>> buildenv.pl.
> 
> Good point --- I have developed the attached patch to implement this.

Thanks, LGTM

--
Daniel Gustafsson   https://vmware.com/

pg document is not recorded：maintenance_work_mem can not large than 2GB in windows

2022-07-23 Thread yanliang lei

pg document is not  recorded：maintenance_work_mem can not large than 2GB in 
windows。

https://www.postgresql.org/docs/14/runtime-config-resource.html




I suggest that：modify maintenance_work_mem parameter description in the pg 
documents：

add maintenance_work_mem parameter size limit in windows platform




if you set maintenance_work_mem large that 2GB in windows platform

postgresql service can not start。 

in the windows event viewer ，thare is the following error:

"ERROR: 3072000 is outside the valid range for parameter "maintenance_work_mem" 
(64 .. 2097151)




and the work_mem parameter also has similar problems as above。

https://www.postgresql.org/message-id/17895.1315869...@sss.pgh.pa.us




so， It is best to add this description to the document about work_mem and 
maintenance_work_mem parameter

Re: documentation on HOT

Re: CREATE INDEX USING documentation

Re: documentation on HOT

Re: documentation on HOT

Re: gettext instructions for Windows seems wrong

pg document is not recorded：maintenance_work_mem can not large than 2GB in windows

6 matches

Site Navigation

Mail list logo

Footer information