On 29/03/16 18:25, Alvaro Herrera wrote:
>+ /*-------------------------------------------------------------------------
>+ * API for construction of generic xlog records
>+ *
>+ * This API allows user to construct generic xlog records which describe
>+ * difference between pages in a generic way. This is useful for
>+ * extensions which provide custom access methods because they can't
>+ * register their own WAL redo routines.
>+ *
>+ * Each record must be constructed by following these steps:
>+ * 1) GenericXLogStart(relation) - start construction of a generic xlog
>+ * record for the given relation.
>+ * 2) GenericXLogRegister(buffer, isNew) - register one or more buffers
>+ * for the record. This function returns a copy of the page
>+ * image where modifications can be performed. The second argument
>+ * indicates if the block is new (i.e. a full page image should be
>+ * taken).
>+ * 3) Apply modification of page images obtained in the previous step.
>+ * 4) GenericXLogFinish() - finish construction of generic xlog record.
>+ *
>+ * The xlog record construction can be canceled at any step by calling
>+ * GenericXLogAbort(). All changes made to page images copies will be
>+ * discarded.
>+ *
>+ * Please, note the following points when constructing generic xlog records.
>+ * - No direct modifications of page images are allowed! All modifications
>+ * must be done in the copies returned by GenericXLogRegister(). In other
>+ * words the code which makes generic xlog records must never call
>+ * BufferGetPage().
>+ * - Registrations of buffers (step 2) and modifications of page images
>+ * (step 3) can be mixed in any sequence. The only restriction is that
>+ * you can only modify page image after registration of corresponding
>+ * buffer.
>+ * - After registration, the buffer also can be unregistered by calling
>+ * GenericXLogUnregister(buffer). In this case the changes made in
>+ * that particular page image copy will be discarded.
>+ * - Generic xlog assumes that pages are using standard layout, i.e., all
>+ * data between pd_lower and pd_upper will be discarded.
>+ * - Maximum number of buffers simultaneously registered for a generic xlog
>+ * record is MAX_GENERIC_XLOG_PAGES. An error will be thrown if this limit
>+ * is exceeded.
>+ * - Since you modify copies of page images, GenericXLogStart() doesn't
>+ * start a critical section. Thus, you can do memory allocation, error
>+ * throwing etc between GenericXLogStart() and GenericXLogFinish().
>+ * The actual critical section is present inside GenericXLogFinish().
>+ * - GenericXLogFinish() takes care of marking buffers dirty and setting their
>+ * LSNs. You don't need to do this explicitly.
>+ * - For unlogged relations, everything works the same except there is no
>+ * WAL record produced. Thus, you typically don't need to do any explicit
>+ * checks for unlogged relations.
>+ * - If registered buffer isn't new, generic xlog record contains delta
>+ * between old and new page images. This delta is produced by per byte
>+ * comparison. This current delta mechanism is not effective for data shifts
>+ * inside the page and may be improved in the future.
>+ * - Generic xlog redo function will acquire exclusive locks on buffers
>+ * in the same order they were registered. After redo of all changes,
>+ * the locks will be released in the same order.
>+ *
>+ *
>+ * Internally, delta between pages consists of set of fragments. Each
>+ * fragment represents changes made in given region of page. A fragment is
>+ * described as follows:
>+ *
>+ * - offset of page region (OffsetNumber)
>+ * - length of page region (OffsetNumber)
>+ * - data - the data to place into described region ('length' number of bytes)
>+ *
>+ * Unchanged regions of page are not represented in the delta. As a result,
>+ * the delta can be more compact than full page image. But if the unchanged region
>+ * of the page is less than fragment header (offset and length) the delta
>+ * would be bigger than the full page image. For this reason we break into fragments
>+ * only if the unchanged region is bigger than MATCH_THRESHOLD.
>+ *
>+ * The worst case for delta size is when we didn't find any unchanged region
>+ * in the page. Then size of delta would be size of page plus size of fragment
>+ * header.
>+ */
>+ #define FRAGMENT_HEADER_SIZE (2 * sizeof(OffsetNumber))
>+ #define MATCH_THRESHOLD FRAGMENT_HEADER_SIZE
>+ #define MAX_DELTA_SIZE (BLCKSZ + FRAGMENT_HEADER_SIZE)
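To make the fragment scheme in the comment concrete, here is a minimal sketch of a per-byte delta builder and its replay. It is a toy model under stated assumptions, not the patch's writeDelta(): the 64-byte page, the uint16_t stand-in for OffsetNumber, and every toy_* name are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE        64                       /* stand-in for BLCKSZ */
#define TOY_HEADER_SIZE      (2 * sizeof(uint16_t))   /* offset + length */
#define TOY_MATCH_THRESHOLD  TOY_HEADER_SIZE
#define TOY_MAX_DELTA_SIZE   (TOY_PAGE_SIZE + TOY_HEADER_SIZE)

/* Append one fragment (offset, length, data) and return the new delta size. */
size_t
toy_put_fragment(uint8_t *delta, size_t used, const uint8_t *newp,
                 uint16_t offset, uint16_t length)
{
    memcpy(delta + used, &offset, sizeof(offset));
    used += sizeof(offset);
    memcpy(delta + used, &length, sizeof(length));
    used += sizeof(length);
    memcpy(delta + used, newp + offset, length);
    return used + length;
}

/* Build a delta by per-byte comparison; delta needs TOY_MAX_DELTA_SIZE bytes. */
size_t
toy_compute_delta(const uint8_t *oldp, const uint8_t *newp, uint8_t *delta)
{
    size_t used = 0;
    size_t i = 0;

    while (i < TOY_PAGE_SIZE)
    {
        size_t start, end;

        if (oldp[i] == newp[i])
        {
            i++;                /* unchanged byte outside any fragment */
            continue;
        }

        start = i;
        end = i + 1;            /* one past the last changed byte seen */
        i = end;

        /* Extend until the run of matching bytes exceeds the threshold. */
        while (i < TOY_PAGE_SIZE && i - end <= TOY_MATCH_THRESHOLD)
        {
            if (oldp[i] != newp[i])
                end = i + 1;
            i++;
        }

        used = toy_put_fragment(delta, used, newp,
                                (uint16_t) start, (uint16_t) (end - start));
    }
    return used;
}

/* Replay a delta onto a page image, as a redo routine would. */
void
toy_apply_delta(uint8_t *page, const uint8_t *delta, size_t size)
{
    size_t pos = 0;

    while (pos < size)
    {
        uint16_t offset, length;

        memcpy(&offset, delta + pos, sizeof(offset));
        pos += sizeof(offset);
        memcpy(&length, delta + pos, sizeof(length));
        pos += sizeof(length);
        memcpy(page + offset, delta + pos, length);
        pos += length;
    }
}
```

In this model, changes at bytes 10 and 12 of an otherwise identical page end up in a single 3-byte fragment, because the one unchanged byte between them is smaller than the header a second fragment would cost. A page where every byte differs yields one fragment of TOY_PAGE_SIZE bytes plus one header, which is the worst case the comment describes.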
I incorporated your changes and made some additional refinements on top
of them.
Attached is a delta against v12, which should cause fewer issues for Teodor
when merging.
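For readers who have not followed the thread, the register/modify/finish flow that the comment describes can be modelled roughly as below. This is only a sketch of the copy-then-commit idea under my own assumptions: it does not use the PostgreSQL headers, all demo_* and Demo* names are hypothetical, and where the comment says an error is thrown when the buffer limit is exceeded, this model just returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE  32
#define DEMO_MAX_PAGES  3           /* stand-in for MAX_GENERIC_XLOG_PAGES */

typedef struct DemoBuffer
{
    uint8_t page[DEMO_PAGE_SIZE];   /* the "shared" page image */
    bool    dirty;
} DemoBuffer;

typedef struct DemoXLogState
{
    DemoBuffer *buffers[DEMO_MAX_PAGES];
    uint8_t     images[DEMO_MAX_PAGES][DEMO_PAGE_SIZE]; /* working copies */
    int         nregistered;
} DemoXLogState;

/* Step 1: start building a record; no page is touched yet. */
void
demo_start(DemoXLogState *state)
{
    state->nregistered = 0;
}

/*
 * Step 2: register a buffer and hand back a private copy of its page.
 * Returns NULL when the limit is exceeded (the comment says the real
 * code throws an error instead).
 */
uint8_t *
demo_register(DemoXLogState *state, DemoBuffer *buffer)
{
    uint8_t *copy;

    if (state->nregistered >= DEMO_MAX_PAGES)
        return NULL;

    state->buffers[state->nregistered] = buffer;
    copy = state->images[state->nregistered];
    memcpy(copy, buffer->page, DEMO_PAGE_SIZE);
    state->nregistered++;
    return copy;
}

/* Step 4: publish every working copy and mark its buffer dirty. */
void
demo_finish(DemoXLogState *state)
{
    int i;

    for (i = 0; i < state->nregistered; i++)
    {
        memcpy(state->buffers[i]->page, state->images[i], DEMO_PAGE_SIZE);
        state->buffers[i]->dirty = true;
    }
    state->nregistered = 0;
}

/* Cancel: drop the working copies; the original pages stay untouched. */
void
demo_abort(DemoXLogState *state)
{
    state->nregistered = 0;
}
```

Because step 3 only ever writes to the copy returned by demo_register(), aborting is free and nothing before demo_finish() has to run inside a critical section, which is the property the comment relies on.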
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c
index 7ca03bf..eab40a2 100644
--- a/src/backend/access/transam/generic_xlog.c
+++ b/src/backend/access/transam/generic_xlog.c
@@ -19,78 +19,77 @@
#include "utils/memutils.h"
/*-------------------------------------------------------------------------
* API for construction of generic xlog records
*
- * This API allows user to construct generic xlog records which are
- * describing difference between pages in general way. Thus it's useful
- * for extension which provides custom access methods because they couldn't
- * register their own WAL redo routines.
+ * This API allows user to construct generic xlog records which describe
+ * difference between pages in a generic way. This is useful for extensions
+ * which provide custom access methods because they can't register their own
+ * WAL redo routines.
*
- * Generic xlog record should be constructed in following steps.
- * 1) GenericXLogStart(relation) - start construction of generic xlog
- * record for given relation.
+ * Each record must be constructed by following these steps:
+ * 1) GenericXLogStart(relation) - start construction of a generic xlog
+ * record for the given relation.
* 2) GenericXLogRegister(buffer, isNew) - register one or more buffers
- * for generic xlog record. This function return a copy of page image
- * where modifications should be performed. The second argument
- * indicates that block is new and full image should be taken.
- * 3) Do modification of page images obtained in previous step.
+ * for generic xlog record. This function returns a copy of the page image
+ * where modifications can be performed. The second argument indicates
+ * if block is new (i.e. a full page image should be taken).
+ * 3) Apply modification of page images obtained in the previous step.
* 4) GenericXLogFinish() - finish construction of generic xlog record.
*
- * Please, note following points while constructing generic xlog records.
+ * The xlog record construction can be canceled at any step by calling
+ * GenericXLogAbort(). All changes made to page images copies will be
+ * discarded.
+ *
+ * Please note the following points when constructing generic xlog records.
* - No direct modifications of page images are allowed! All modifications
- * should be done in copies returned by GenericXLogRegister(). Literally
- * code which makes generic xlog records should never call
- * BufferGetPage() function.
- * - On any step generic xlog record construction could be canceled by
- * calling GenericXLogAbort(). All changes made in page images copies
- * would be discarded.
+ * must be done in copies returned by GenericXLogRegister(). In other words,
+ * code which makes generic xlog records must never call BufferGetPage().
* - Registrations of buffers (step 2) and modifications of page images
- * (step 3) could be mixed in any sequence. The only restriction is that
- * you can modify page image only after registration of corresponding
+ * (step 3) can be mixed in any sequence. The only restriction is that
+ * you can only modify page image after registration of corresponding
* buffer.
- * - After registration buffer also can be unregistered by calling
- * GenericXLogUnregister(buffer). In this case changes made in particular
- * page image copy will be discarded.
+ * - After registration, the buffer can also be unregistered by calling
+ * GenericXLogUnregister(buffer). In this case the changes made in
+ * that particular page image copy will be discarded.
* - Generic xlog assumes that pages are using standard layout. I.e. all
* information between pd_lower and pd_upper will be discarded.
- * - Maximum number of buffers simultaneously registered for generic xlog
- * is MAX_GENERIC_XLOG_PAGES. Error would be thrown if this limit
+ * - Maximum number of buffers simultaneously registered for a generic xlog
+ * record is MAX_GENERIC_XLOG_PAGES. An error will be thrown if this limit is
* exceeded.
* - Since you modify copies of page images, GenericXLogStart() doesn't
* start a critical section. Thus, you can do memory allocation, error
* throwing etc between GenericXLogStart() and GenericXLogFinish().
- * Actual critical section present inside GenericXLogFinish().
- * - GenericXLogFinish() takes care about marking buffers dirty and setting
+ * The actual critical section is present inside GenericXLogFinish().
+ * - GenericXLogFinish() takes care of marking buffers dirty and setting
* their LSNs. You don't need to do this explicitly.
- * - For unlogged relations, everything work the same expect there is no
+ * - For unlogged relations, everything works the same except there is no
* WAL record produced. Thus, you typically don't need to do any explicit
* checks for unlogged relations.
* - If registered buffer isn't new, generic xlog record contains delta
- * between old and new page images. This delta is produced by per byte
- * comparison. Current delta mechanist is not effective for data shift
- * inside the page. However, it could be improved in further versions.
+ * between old and new page images. This delta is produced using per-byte
+ * comparison. The current delta mechanism is not effective for data shifts
+ * inside the page and may be improved in the future.
+ * - Generic xlog redo function will acquire exclusive locks on buffers
- * in the same order they were registered. After redo of all changes
- * locks would be released in the same order. That could makes sense for
- * concurrency.
+ * in the same order as they were registered. After redo of all changes,
+ * locks will be released in the same order.
*
- * Internally delta between pages consists of set of fragments. Each fragment
- * represents changes made in given region of page. Fragment is described
- * as following.
+ * Internally, delta between pages consists of a set of fragments. Each
+ * fragment represents changes made in a given region of a page. A fragment
+ * is described as follows:
*
* - offset of page region (OffsetNumber)
* - length of page region (OffsetNumber)
* - data - the data to place into described region ('length' number of bytes)
*
- * Unchanged regions of page are uncovered by these fragments. This is why
- * delta could be more compact than full page image. But if unchanged region
- * of page is less than fragment header (offset and length) then it would
- * increase size of delta instead of decreasing. Thus, we break fragment only
- * for unchanged regions greater than MATCH_THRESHOLD.
+ * Unchanged regions of a page are not represented in the delta. As a result,
+ * the delta can be more compact than the full page image. But if an unchanged
+ * region is smaller than the fragment header (offset and length), splitting
+ * the fragment around it would grow the delta instead of shrinking it. So we
+ * break a fragment only if the unchanged region is bigger than MATCH_THRESHOLD.
*
* The worst case for delta size is when we didn't find any unchanged region
- * in the page. Then size of delta would be size of page plus size of fragment
- * header.
+ * in the page. The size of delta will be size of page plus size of fragment
+ * header in that case.
*/
#define FRAGMENT_HEADER_SIZE (2 * sizeof(OffsetNumber))
#define MATCH_THRESHOLD FRAGMENT_HEADER_SIZE
@@ -168,8 +167,8 @@ writeDelta(PageData *pageData)
bool match;
/*
- * Check if bytes in old and new page images matches. We don't rely
- * data in unallocated area between pd_lower and pd_upper. Thus we
+ * Check if bytes in old and new page images match. We don't care
+ * about data in unallocated area between pd_lower and pd_upper. We
* assume unallocated area to expand with unmatched bytes. Bytes
* inside unallocated area are assumed to always match.
*/