get rid of distprep?
I'm thinking about whether we should get rid of the distprep target, the step in the preparation of the official source tarball that creates a bunch of prebuilt files using bison, flex, perl, etc. for inclusion in the tarball. I think this concept is no longer fitting for contemporary software distribution.

There is a lot of interest these days in making the artifacts of software distribution traceable, for security and legal reasons. You can trace the code from an author into Git, from Git into a tarball, somewhat from a tarball into a binary package (for example using reproducible builds), and from a binary package onto a user's system. Having some mystery prebuilt files in the middle there does not feel right. Packaging guidelines nowadays tend to disfavor such practices and either suggest, recommend, or require removing and rebuilding such files. This whole thing was fairly cavalier when we shipped gram.c, scan.c, and one or two other files, but now the number of prebuilt files is more than 100, not including the documentation, so this is a bit more serious.

Practically, who even uses source tarballs these days? They are a vehicle for packagers, but packagers are not really helped by adding a bunch of prebuilt files. I think this practice started before there even were things like rpm. Nowadays, most people who want to work with the source should and probably do use git, so making the difference between a git checkout and a source tarball smaller would probably be good. And it would also make the actual tarball smaller.

The practical costs of this are also not negligible. Because of the particular way configure handles bison and flex, it happens a bunch of times on new and test systems that the build proceeds and then tells you that you should have installed bison five minutes ago. Also, extensions cannot rely on bison, flex, or perl being available, except that it often works anyway, so this is not dealt with correctly.

Who benefits from these prebuilt files? I doubt anyone actually has problems obtaining useful installations of bison, flex, or perl. There is the documentation build, but that also seems pretty robust nowadays, and in any case you don't need to build the documentation to get a useful installation. We could make some adjustments so that not building the documentation is more accessible. The only users of this would appear to be those not using git and not using any packaging. That number is surely not zero, but it's probably very small and doesn't seem worth catering to specifically.

Thoughts?

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should the nbtree page split REDO routine's locking work more like the locking on the primary?
> On 8 Aug 2020, at 03:28, Peter Geoghegan wrote: > > On Thu, Aug 6, 2020 at 7:00 PM Peter Geoghegan wrote: >> On Thu, Aug 6, 2020 at 6:08 PM Tom Lane wrote: >>> +1 for making this more like what happens in original execution ("on the >>> primary", to use your wording). Perhaps what you suggest here is still >>> not enough like the original execution, but it sounds closer. >> >> It won't be the same as the original execution, exactly -- I am only >> thinking of holding on to same-level page locks (the original page, >> its new right sibling, and the original right sibling). > > I pushed a commit that reorders the lock acquisitions within > btree_xlog_unlink_page() -- they're now consistent with _bt_split() > (at least among sibling pages involved in the page split). Sounds great, thanks! Best regards, Andrey Borodin.
Re: [PATCH] Covering SPGiST index
> On 7 Aug 2020, at 16:59, Pavel Borisov wrote: > > As usual I very much appreciate your feedback Thanks for the patch! Looks interesting. At first glance the whole concept of a non-multicolumn index with included attributes seems... well, just difficult to understand. But I expect that for SP-GiST this must be a single key with multiple included attributes, right? I couldn't find a test that checks the impossibility of a 2-column SP-GiST index, only a few asserts about it. Is this checked somewhere else? Thanks! Best regards, Andrey Borodin.
Re: [Patch] Optimize dropping of relation buffers using dlist
On Fri, Aug 7, 2020 at 9:33 AM Tom Lane wrote: > > Amit Kapila writes: > > On Sat, Aug 1, 2020 at 1:53 AM Andres Freund wrote: > >> We could also just use pg_class.relpages. It'll probably mostly be > >> accurate enough? > > > Don't we need the accurate 'number of blocks' if we want to invalidate > > all the buffers? Basically, I think we need to perform BufTableLookup > > for all the blocks in the relation and then Invalidate all buffers. > > Yeah, there is no room for "good enough" here. If a dirty buffer remains > in the system, the checkpointer will eventually try to flush it, and fail > (because there's no file to write it to), and then checkpointing will be > stuck. So we cannot afford to risk missing any buffers. >

Right, this reminds me of the discussion we had last time on this topic, where we decided that we can't even rely on using smgrnblocks to find the exact number of blocks because lseek might lie about the EOF position [1]. So we need some mechanism anyway to push the information about the "to be truncated or dropped" relations to the background worker (checkpointer and/or others) to avoid flush issues. But maybe it is better to push the responsibility of invalidating the buffers for a truncated/dropped relation to the background process as well. However, I feel that in cases where the relation size is greater than the number of shared buffers there might not be much benefit in pushing this operation to the background, unless there are already a few other relation entries (for dropped relations) so that the cost of scanning the buffers can be amortized.

[1] - https://www.postgresql.org/message-id/16664.1435414204%40sss.pgh.pa.us -- With Regards, Amit Kapila.
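For readers skimming the archive, here is a rough standalone model (not PostgreSQL source) of the per-block lookup idea described above, contrasted with a full scan of shared buffers. The structures and helper names are simplified stand-ins for the real buffer-manager machinery, and, per the discussion, the block count passed in must be exact or dirty buffers could be left behind.

/*
 * Toy model (not PostgreSQL code) of the two strategies discussed above:
 * dropping a relation's buffers by scanning every shared buffer versus
 * probing the buffer mapping table once per block of the relation.
 */
#include <stdio.h>
#include <stdbool.h>

#define NBUFFERS 16              /* shared_buffers, tiny for the example */

typedef struct BufDesc
{
    bool     valid;
    unsigned relid;              /* which relation the page belongs to */
    unsigned blocknum;           /* which block of that relation */
} BufDesc;

static BufDesc buffers[NBUFFERS];

/* Stand-in for BufTableLookup(): in PostgreSQL this is a hash probe, O(1). */
static int
lookup_buffer(unsigned relid, unsigned blk)
{
    for (int i = 0; i < NBUFFERS; i++)
        if (buffers[i].valid && buffers[i].relid == relid &&
            buffers[i].blocknum == blk)
            return i;
    return -1;
}

/* Current approach: examine every shared buffer, O(shared_buffers). */
static void
drop_rel_buffers_scan(unsigned relid)
{
    for (int i = 0; i < NBUFFERS; i++)
        if (buffers[i].valid && buffers[i].relid == relid)
            buffers[i].valid = false;
}

/*
 * Proposed approach: one lookup per block, O(relation size in blocks).
 * nblocks must be an exact, trustworthy block count -- a stale or
 * "good enough" value would leave dirty buffers behind.
 */
static void
drop_rel_buffers_lookup(unsigned relid, unsigned nblocks)
{
    for (unsigned blk = 0; blk < nblocks; blk++)
    {
        int b = lookup_buffer(relid, blk);

        if (b >= 0)
            buffers[b].valid = false;
    }
}

int
main(void)
{
    buffers[0] = (BufDesc){true, 1000, 0};
    buffers[1] = (BufDesc){true, 1000, 3};
    buffers[2] = (BufDesc){true, 2000, 7};

    drop_rel_buffers_lookup(1000, 4);    /* relation 1000 has 4 blocks */
    drop_rel_buffers_scan(2000);

    for (int i = 0; i < NBUFFERS; i++)
        if (buffers[i].valid)
            printf("buffer %d still valid\n", i);
    return 0;
}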
Re: [Patch] Optimize dropping of relation buffers using dlist
On Fri, Aug 7, 2020 at 11:03 PM Robert Haas wrote: > > On Fri, Aug 7, 2020 at 12:52 PM Tom Lane wrote: > > At least in the case of segment zero, the file will still exist. It'll > > have been truncated to zero length, and if the filesystem is stupid about > > holes in files then maybe a write to a high block number would consume > > excessive disk space, but does anyone still care about such filesystems? > > I don't remember at the moment how we handle higher segments, >

We do unlink them and register the request to forget the fsync requests for those. See mdunlinkfork.

> > but likely > > we could make them still exist too, postponing all the unlinks till after > > checkpoint. Or we could just have the backends give up on recycling a > > particular buffer if they can't write it (which is the response to an I/O > > failure already, I hope). >

Note that we don't often try to flush the buffers from the backend. We first try to forward the request to the checkpointer queue, and only if the queue is full does the backend try to flush it, so even if we decide to give up flushing such a buffer (where we get an error) via the backend, it shouldn't impact very many cases. I am not sure, but if we can somehow reliably distinguish this type of error from any other I/O failure, then we can probably give up on flushing this buffer and continue, or maybe just retry pushing this request to the checkpointer.

> > None of this sounds very appealing. Postponing the unlinks means > postponing recovery of the space at the OS level, which I think will > be noticeable and undesirable for users. The other notions all seem to > involve treating as valid on-disk states we currently treat as > invalid, and our sanity checks in this area are already far too weak. > And all you're buying for it is putting a hash table that would > otherwise be shared memory into backend-private memory, which seems > like quite a minor gain. Having that information visible to everybody > seems a lot cleaner. >

One more benefit of giving this responsibility to a single process like the checkpointer is that we can avoid unlinking the relation until we have scanned all the buffers corresponding to it. Now, surely keeping it in shared memory and allowing other processes to work on it has other merits, namely that such buffers might get invalidated faster, but I am not sure we can retain the benefit of the other approach, which is to perform all such invalidation of buffers before unlinking the relation's first segment. -- With Regards, Amit Kapila.
Re: LSM tree for Postgres
On 07.08.2020 15:31, Alexander Korotkov wrote: Wed, 5 Aug 2020, 09:13 Konstantin Knizhnik: Concerning degradation of the base index - the B-Tree itself is a balanced tree. Yes, insertion of random keys can cause splits of B-Tree pages. In the worst case half of a B-Tree page will be empty. So the B-Tree size will be two times larger than the ideal tree. It may cause degradation of up to two times. But that is all. There should not be infinite degradation of speed tending to zero. My concerns are not just about space utilization. My main concern is about the order of the pages. After the first merge the base index will be filled in key order. So physical page ordering perfectly matches their logical ordering. After the second merge some pages of the base index split, and new pages are added to the end of the index. Splits also happen in key order. So, now physical and logical orderings match within two extents corresponding to the first and second merges, but not within the whole tree. While there are only a few such extents, disk page reads may in fact be mostly sequential, thanks to OS cache and readahead. But finally, after many merges, we can end up with mostly random page reads. For instance, leveldb doesn't have a problem of ordering degradation, because it stores levels in sorted files.

I agree with you that losing the sequential order of B-Tree pages may have a negative impact on performance. But it is first of all critical for order-by and range queries, where we have to traverse several subsequent leaf pages. It is less critical for exact-search or delete/insert operations. The efficiency of merge operations mostly depends on how many keys are stored on the same B-Tree page. And that is first of all determined by the size of the top index and the key distribution.
Re: get rid of distprep?
Peter Eisentraut writes: > I'm thinking about whether we should get rid of the distprep target, ... > Who benefits from these prebuilt files? I doubt anyone actually has > problems obtaining useful installations of bison, flex, or perl.

I'm sure it was a bigger issue twenty years ago, but yeah, nowadays our minimum requirements for those tools are so ancient that everybody who cares to build from source should have usable versions available.

I think the weak spot in your argument, though, is the documentation. There is basically nothing that is standardized or reproducible in that toolchain, as every platform names and subdivides the relevant packages differently, if they exist at all. I was reminded of that just recently when I updated my main workstation to RHEL8, and had to jump through a lot of hoops to get everything installed that's needed to build the docs (and I still lack the tools for some of the weirder products such as epub). I'd be willing to say "you must have bison, flex, and perl to build" --- and maybe we could even avoid having a long discussion about what "perl" means in this context --- but I fear the doc tools situation would be a mess.

> The only users of this > would appear to be those not using git and not using any packaging.

No, there's the packagers themselves who would be bearing the brunt of rediscovering how to build the docs on their platforms. And if the argument is that there's a benefit to them of making the build more reproducible, I'm not sure I buy it, because of (1) timestamps in the output files and (2) docbook's willingness to try to download missing bits off the net. (2) is a huge and not very obvious hazard to reproducibility. But maybe you ought to be surveying -packagers about the question instead of theorizing here. Would *they* see this as a net benefit?

One other point to consider is that distprep or no distprep, I'd be quite sad if the distclean target went away. That's extremely useful in normal development workflows to tear down everything that depends on configure output, without giving up some of the more expensive build products such as gram.c and preproc.c.

regards, tom lane
Re: Replace remaining StrNCpy() by strlcpy()
Peter Eisentraut writes: > I removed namecpy() altogether because you can just use struct assignment. Makes sense, and I notice it was unused anyway. v3 passes eyeball examination (I didn't bother running tests), with only one remaining nit: the proposed commit message says "They are equivalent", which per this thread is incorrect. Somebody might possibly refer to this commit for guidance in updating third-party code, so I don't think we want to leave a misleading claim here. Perhaps something like "They are equivalent, except that StrNCpy zero-fills the entire destination buffer instead of providing just one trailing zero. For all but a tiny number of callers, that's just overhead rather than being desirable." regards, tom lane
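As an aside for anyone updating third-party code along these lines, a small self-contained illustration of the behavioral difference described above; copy_strn() and copy_strl() are local stand-ins for PostgreSQL's StrNCpy() macro and for strlcpy() (written out here only because glibc does not ship strlcpy()).

/*
 * Standalone illustration (not PostgreSQL source): a strncpy-based copy
 * zero-fills the whole destination buffer, while an strlcpy-style copy
 * writes just one trailing NUL and leaves the rest of the buffer alone.
 */
#include <stdio.h>
#include <string.h>

/* Simplified equivalent of StrNCpy(): strncpy plus a guaranteed trailing
 * NUL.  strncpy() itself pads the remainder of the buffer with zeros. */
static void
copy_strn(char *dst, const char *src, size_t len)
{
    if (len > 0)
    {
        strncpy(dst, src, len);
        dst[len - 1] = '\0';
    }
}

/* strlcpy-style copy: at most len-1 bytes plus one NUL, no padding. */
static size_t
copy_strl(char *dst, const char *src, size_t len)
{
    size_t srclen = strlen(src);

    if (len > 0)
    {
        size_t n = (srclen < len - 1) ? srclen : len - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

static void
dump(const char *label, const unsigned char *buf, size_t len)
{
    printf("%s:", label);
    for (size_t i = 0; i < len; i++)
        printf(" %02x", buf[i]);
    printf("\n");
}

int
main(void)
{
    char a[16], b[16];

    memset(a, 0xAA, sizeof(a));
    memset(b, 0xAA, sizeof(b));

    copy_strn(a, "hi", sizeof(a));   /* bytes after the NUL become 00 */
    copy_strl(b, "hi", sizeof(b));   /* bytes after the NUL stay 0xAA */

    dump("StrNCpy-style", (unsigned char *) a, sizeof(a));
    dump("strlcpy-style", (unsigned char *) b, sizeof(b));
    return 0;
}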
Re: walsender waiting_for_ping spuriously set
Pushed. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Amcheck: do rightlink verification with lock coupling
On Thu, Aug 6, 2020 at 10:59 PM Andrey M. Borodin wrote: > But having complete solution with no false positives seems much better. Agreed. I know that you didn't pursue this for no reason -- having the check available makes bt_index_check() a lot more valuable in practice. It detects what is actually a classic example of subtle B-Tree corruption (left link corruption), which appears in Modern B-Tree Techniques in its discussion of corruption detection. It's actually the canonical example of how B-Tree corruption can be very subtle in the real world. I pushed a cleaned up version of this patch just now. I added some commentary about this canonical example in header comments for the new function. Thanks -- Peter Geoghegan
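For context, the check being discussed follows roughly this pattern. The sketch below is a simplified standalone model, not the actual amcheck code: buffer locks, real B-Tree page headers, and the P_NONE sentinel are replaced by plain structs and array indexes.

/*
 * Simplified sketch of sibling-link verification with lock coupling: while
 * still "holding" page P, visit its right sibling R and confirm that R's
 * left link points back to P.  With the locks coupled, the sibling cannot
 * change between reading P's right link and checking R's left link.
 */
#include <stdio.h>

#define NONE (-1)                /* stand-in for "no sibling" */

typedef struct Page
{
    int left;                    /* block number of left sibling, or NONE */
    int right;                   /* block number of right sibling, or NONE */
} Page;

static int
check_sibling_links(Page *pages, int leftmost)
{
    for (int blk = leftmost; blk != NONE; blk = pages[blk].right)
    {
        int rblk = pages[blk].right;     /* read while "holding" blk */

        if (rblk != NONE && pages[rblk].left != blk)
        {
            printf("corruption: page %d's left link is %d, expected %d\n",
                   rblk, pages[rblk].left, blk);
            return 1;
        }
        /* the real code releases blk only after locking rblk */
    }
    return 0;
}

int
main(void)
{
    /* three leaf pages 0 <-> 1 <-> 2, with page 2's left link corrupted */
    Page pages[3] = {
        {NONE, 1},
        {0, 2},
        {0, NONE},               /* should point back to 1, not 0 */
    };

    return check_sibling_links(pages, 0);
}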
Re: LSM tree for Postgres
On Sat, Aug 8, 2020 at 5:07 PM Konstantin Knizhnik wrote: > I agree with you that losing the sequential order of B-Tree pages may have > a negative impact on performance. > But it is first of all critical for order-by and range queries, where we > have to traverse several subsequent leaf pages. > It is less critical for exact-search or delete/insert operations. > The efficiency of merge operations mostly depends on how many keys > are stored on the same B-Tree page.

What do you mean by "mostly"? Given that PostgreSQL has quite small (8k) pages, sequential reads are several times faster than random reads on SSDs (dozens of times faster on HDDs). I don't think this is something to neglect.

> And that is first of all > determined by the size of the top index and the key distribution.

How can you be sure that the top index can fit in memory? On production systems, there are typically multiple consumers of memory: other tables, indexes, other LSMs. This is one of the reasons why LSM implementations have multiple levels: they don't know in advance which levels fit in memory. Another reason is dealing with very large datasets. And I believe there is quite a strong reason to keep the page order sequential within a level.

I'm OK with your design for a third-party extension. It's very cool to have. But I'm -1 for something like this to get into core PostgreSQL, assuming it's feasible to push some effort and get a state-of-the-art LSM there. -- Regards, Alexander Korotkov
2020-08-13 Update + PostgreSQL 13 Beta 3 Release Announcement Draft
Hi, Attached is a draft of the release announcement for the update release on 2020-08-13, which also includes the release of PostgreSQL 13 Beta 3. Reviews and feedback are welcome. This is a fairly hefty release announcement as it includes notes both about the update release and the beta. I tried to keep the notes about Beta 3 focused on the significant changes, with a reference to the open items page. If you believe I missed something that is significant, please let me know. Please be sure all feedback is delivered by 2020-08-12 AoE. Thanks, Jonathan

2020-08-13 Cumulative Update
----------------------------

The PostgreSQL Global Development Group has released an update to all supported versions of our database system, including 12.4, 11.9, 10.14, 9.6.19, and 9.5.23, as well as the 3rd Beta release of PostgreSQL 13. This release fixes over 50 bugs reported over the last three months. Please plan to update at your earliest convenience.

A Note on the PostgreSQL 13 Beta
--------------------------------

This release marks the third beta release of PostgreSQL 13 and puts the community one step closer to general availability this fall. In the spirit of the open source PostgreSQL community, we strongly encourage you to test the new features of PostgreSQL 13 in your database systems to help us eliminate any bugs or other issues that may exist. While we do not advise you to run PostgreSQL 13 Beta 3 in your production environments, we encourage you to find ways to run your typical application workloads against this beta release. Your testing and feedback will help the community ensure that the PostgreSQL 13 release upholds our standards of providing a stable, reliable release of the world's most advanced open source relational database.

PostgreSQL 9.5 EOL Notice
-------------------------

PostgreSQL 9.5 will stop receiving fixes on February 11, 2021. If you are running PostgreSQL 9.5 in a production environment, we suggest that you make plans to upgrade to a newer, supported version of PostgreSQL. Please see our [versioning policy](https://www.postgresql.org/support/versioning/) for more information.

Bug Fixes and Improvements
--------------------------

This update also fixes over 50 bugs that were reported in the last several months. Some of these issues affect only version 12, while others affect all supported versions. Some of these fixes include:

* Fix edge cases in partition pruning involving multiple partition key columns with multiple or no constraining WHERE clauses.
* Several fixes for query planning and execution involving partitions.
* Fix for determining when to execute a column-specific UPDATE trigger on a logical replication subscriber.
* `pg_replication_slot_advance()` now updates the oldest xmin and LSN values, as the failure to do this could prevent resources (e.g. WAL files) from being cleaned up.
* Performance improvements for `ts_headline()`.
* Ensure that `pg_read_file()` and related functions read until EOF is reached, which fixes compatibility with pipes and other virtual files.
* Forbid numeric `NaN` values in jsonpath computations, which exist in neither SQL nor JSON.
* Several fixes for `NaN` inputs with aggregate functions. This fixes a change in PostgreSQL 12 where `NaN` values caused the following aggregates to emit a value of `0` instead of `NaN`: `corr()`, `covar_pop()`, `regr_intercept()`, `regr_r2()`, `regr_slope()`, `regr_sxx()`, `regr_sxy()`, `regr_syy()`, `stddev_pop()`, and `var_pop()`.
* `time` and `timetz` values greater than `24:00:00` are now rejected.
* Several fixes for `EXPLAIN`, including a fix for reporting resource usage when a plan uses parallel workers with "Gather Merge" nodes.
* Fix timing of constraint revalidation in `ALTER TABLE` that could lead to odd errors.
* Fix for `REINDEX CONCURRENTLY` that could prevent old values from being included in future logical decoding output.
* Fix for LATERAL references that could potentially cause crashes during query execution.
* Use the collation specified for a query when estimating operator costs.
* Fix conflict-checking anomalies in SERIALIZABLE transaction isolation mode.
* Ensure the checkpointer process discards file sync requests when fsync is off.
* Fix issue where `pg_control` could be written out with an inconsistent checksum, which could lead to the inability to restart the database if it crashed before the next `pg_control` update.
* Ensure that libpq continues to try to read from the database connection socket after a write failure, as this allows the connection to collect any final error messages from the server.
* Report out-of-disk-space errors properly in `pg_dump` and `pg_basebackup`.
* Several fixes for `pg_restore`, including a fix for parallel restore on tables that have both table-level and column-level privileges.
* Fix for `pg_upgrade` to ensure it runs with `vacuum_defer_cleanup_age` set to `0`.
* Fix how `pg_rewind` handles just-deleted files in the source data directory.
* Fix failu
Re: LSM tree for Postgres
On 08.08.2020 21:18, Alexander Korotkov wrote: On Sat, Aug 8, 2020 at 5:07 PM Konstantin Knizhnik wrote: I agree with you that losing the sequential order of B-Tree pages may have a negative impact on performance. But it is first of all critical for order-by and range queries, where we have to traverse several subsequent leaf pages. It is less critical for exact-search or delete/insert operations. The efficiency of merge operations mostly depends on how many keys are stored on the same B-Tree page. What do you mean by "mostly"? Given that PostgreSQL has quite small (8k) pages, sequential reads are several times faster than random reads on SSDs (dozens of times faster on HDDs). I don't think this is something to neglect.

When you insert one record into a B-Tree, the order of pages doesn't matter at all. If you insert ten records into one leaf page, the order is also not so important. If you insert 100 records, 50 going to one page and 50 to the next page, then insertion may be faster if the second page follows the first one on disk. But such an insertion may cause a page split and thus allocation of a new page, so the sequential write order can still be violated.

And that is first of all determined by the size of the top index and the key distribution. How can you be sure that the top index can fit in memory? On production systems, there are typically multiple consumers of memory: other tables, indexes, other LSMs. This is one of the reasons why LSM implementations have multiple levels: they don't know in advance which levels fit in memory. Another reason is dealing with very large datasets. And I believe there is quite a strong reason to keep the page order sequential within a level.

There is no guarantee that the top index is kept in memory. But since the top index pages are frequently accessed, I hope that the buffer manager's cache replacement algorithm does its best to keep them in memory.

I'm OK with your design for a third-party extension. It's very cool to have. But I'm -1 for something like this to get into core PostgreSQL, assuming it's feasible to push some effort and get a state-of-the-art LSM there.

I realize that it is not a true LSM. But I still want to note that it is able to provide a ~10x increase in insert speed when the size of the index is comparable to the RAM size. And the "true LSM" from RocksDB shows similar results. Maybe if the size of the index is 100 times larger than the size of RAM, RocksDB will be significantly faster than Lsm3. But modern servers have 0.5-1 TB of RAM. I can't believe that there are databases with 100 TB indexes.
Re: LSM tree for Postgres
On Sat, Aug 8, 2020 at 11:49 PM Konstantin Knizhnik wrote: > On 08.08.2020 21:18, Alexander Korotkov wrote: > > On Sat, Aug 8, 2020 at 5:07 PM Konstantin Knizhnik > > wrote: > >> I agree with you that losing the sequential order of B-Tree pages may have > >> a negative impact on performance. > >> But it is first of all critical for order-by and range queries, where we > >> have to traverse several subsequent leaf pages. > >> It is less critical for exact-search or delete/insert operations. > >> The efficiency of merge operations mostly depends on how many keys > >> are stored on the same B-Tree page. > > What do you mean by "mostly"? Given that PostgreSQL has quite small (8k) > > pages, sequential reads are several times faster than random reads on SSDs > > (dozens of times faster on HDDs). I don't think this is something to > > neglect. > > When you insert one record into a B-Tree, the order of pages doesn't matter > at all. > If you insert ten records into one leaf page, the order is also not so > important. > If you insert 100 records, 50 going to one page and 50 to the next page, > then insertion may be faster if the second page follows the first one on disk. > But such an insertion may cause a page split and thus allocation of a new page, > so the sequential write order can still be violated.

Sorry, I have no idea what you're getting at.

> >> And that is first of all > >> determined by the size of the top index and the key distribution. > > How can you be sure that the top index can fit in memory? On production > > systems, there are typically multiple consumers of memory: other > > tables, indexes, other LSMs. This is one of the reasons why LSM > > implementations have multiple levels: they don't know in advance which > > levels fit in memory. Another reason is dealing with very large > > datasets. And I believe there is quite a strong reason to keep the page > > order sequential within a level. > > There is no guarantee that the top index is kept in memory. > But since the top index pages are frequently accessed, I hope that the buffer > manager's cache replacement > algorithm does its best to keep them in memory.

So, the top index should be small enough that we can safely assume it wouldn't be evicted from cache on a heavily loaded production system. I think it's evident that it should be orders of magnitude less than the total amount of server RAM.

> > I'm OK with your design for a third-party extension. It's very cool > > to have. But I'm -1 for something like this to get into core > > PostgreSQL, assuming it's feasible to push some effort and get > > a state-of-the-art LSM there. > I realize that it is not a true LSM. > But I still want to note that it is able to provide a ~10x increase > in insert speed when the size of the index is comparable to the RAM size. > And the "true LSM" from RocksDB shows similar results.

It's very far from being shown. All you've shown is a naive benchmark. I don't object that your design can work well in some cases. And it's great that we have the lsm3 extension now. But I think for PostgreSQL core we should think about a better design.

> Maybe if the size of > the index is 100 times larger than > the size of RAM, RocksDB will be significantly faster than Lsm3. But modern > servers have 0.5-1 TB of RAM. > I can't believe that there are databases with 100 TB indexes.

Comparison of the whole RAM size to a single index size looks plain wrong to me. I think we can roughly compare the whole RAM size to the whole database size. But also, not all of the RAM is always available for caching data. Let's assume half of RAM is used for caching data.
So, a modern server with 0.5-1 TB of RAM, which suffers from random B-tree insertions and badly needs an LSM-like data structure, runs a database of 25-50 TB. Frankly speaking, there is nothing counterintuitive in that for me. -- Regards, Alexander Korotkov
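Spelling out the arithmetic behind that estimate (using the stated assumption that half of RAM is available for caching, and the roughly 100x data-to-cache ratio implied by the "100 times larger than RAM" figure from the previous message):

\[
\text{cache} \approx \tfrac{1}{2}\,\text{RAM} \approx 0.25\text{--}0.5\ \text{TB},
\qquad
\text{data} \approx 100 \times \text{cache} \approx 25\text{--}50\ \text{TB}.
\]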
Re: 回复:how to create index concurrently on partitioned table
On Sat, Aug 08, 2020 at 01:37:44AM -0500, Justin Pryzby wrote: > That gave me the idea to layer CIC on top of Reindex, since I think it does > exactly what's needed.

For now, I would recommend focusing first on 0001 to add support for partitioned tables and indexes to REINDEX. CIC is much more complicated, by the way, but I am not entering into the details now.

-       /*
-        * This may be useful when implemented someday; but that day is not today.
-        * For now, avoid erroring out when called in a multi-table context
-        * (REINDEX SCHEMA) and happen to come across a partitioned table. The
-        * partitions may be reindexed on their own anyway.
-        */
+       /* Avoid erroring out */
        if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
        {

This comment does not help, and it actually becomes incorrect, as reindex for this relkind becomes supported once 0001 is done.

+               case RELKIND_INDEX:
+                       reindex_index(inhrelid, false, get_rel_persistence(inhrelid),
+                                     options | REINDEXOPT_REPORT_PROGRESS);
+                       break;
+               case RELKIND_RELATION:
+                       (void) reindex_relation(inhrelid,
+                                               REINDEX_REL_PROCESS_TOAST |
+                                               REINDEX_REL_CHECK_CONSTRAINTS,
+                                               options | REINDEXOPT_REPORT_PROGRESS);

ReindexPartitionedRel() fails to consider the concurrent case here for partition indexes and tables, as reindex_index()/reindex_relation() are the APIs used in the non-concurrent case. Once you consider the concurrent case correctly, we also need to be careful with partitions that have a temporary persistency (note that we don't allow partition trees to mix persistency types; all partitions have to be temporary or permanent).

I think that you are right to make the entry point to handle a partitioned index in ReindexIndex() and a partitioned table in ReindexTable(), but the structure of the patch should be different:

- The second portion of ReindexMultipleTables() should be moved into a separate routine, taking as input a list of relation OIDs. This needs to be extended a bit so that reindex_index() gets called for an index relkind if the relpersistence is temporary or if we have a non-concurrent reindex. The idea is that we finish with a single code path able to work on a list of relations. And your patch adds more of that as of ReindexPartitionedRel().

- We should *not* handle partitioned indexes and/or tables directly in ReindexRelationConcurrently(), so as not to complicate the logic where we gather all the indexes of a table/matview. So I think that the list of partition indexes/tables to work on should be built directly in ReindexIndex() and ReindexTable(), and then this should call the second part of ReindexMultipleTables() refactored in the previous point.

This way, each partition index gets done individually in its own transaction. For a partition table, all indexes of this partition are rebuilt in the same set of transactions. For the concurrent case, we already have reindex_concurrently_swap, which is able to switch the dependencies of two indexes within a partition tree, so we can rely on that so that a failure in the middle of the operation never leaves a partition structure in an inconsistent state. -- Michael signature.asc Description: PGP signature
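As a rough illustration of the structure proposed above, here is a standalone sketch (simplified stand-ins only, not PostgreSQL APIs): one routine that receives the already-built list of partition OIDs and processes each member in its own transaction, dispatching on relkind. In the actual patch the list would be built in ReindexIndex()/ReindexTable(), and the real reindex_index()/reindex_relation() calls and transaction control would be used instead of the stubs shown here.

/*
 * Toy sketch of the proposed flow: one relation per transaction, dispatched
 * on whether the member is an index or a table.  All names are stand-ins.
 */
#include <stdio.h>

typedef unsigned int Oid;

typedef enum RelKind { RELKIND_INDEX_T, RELKIND_TABLE_T } RelKind;

/* Stubs standing in for transaction control and the low-level reindex calls. */
static void start_transaction(void)        { printf("BEGIN\n"); }
static void commit_transaction(void)       { printf("COMMIT\n"); }
static void reindex_one_index(Oid oid)     { printf("  reindex index %u\n", oid); }
static void reindex_table_indexes(Oid oid) { printf("  reindex all indexes of table %u\n", oid); }

typedef struct RelEntry
{
    Oid     oid;
    RelKind kind;
} RelEntry;

/*
 * Process each partition in its own transaction, so a failure part-way
 * through leaves already-processed partitions committed and the rest intact.
 */
static void
reindex_multiple(const RelEntry *rels, int nrels)
{
    for (int i = 0; i < nrels; i++)
    {
        start_transaction();
        if (rels[i].kind == RELKIND_INDEX_T)
            reindex_one_index(rels[i].oid);
        else
            reindex_table_indexes(rels[i].oid);
        commit_transaction();
    }
}

int
main(void)
{
    /* example OIDs only */
    RelEntry parts[] = {
        {16401, RELKIND_INDEX_T},
        {16405, RELKIND_INDEX_T},
        {16410, RELKIND_TABLE_T},
    };

    reindex_multiple(parts, 3);
    return 0;
}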
Re: Allow some recovery parameters to be changed with reload
On Wed, Aug 05, 2020 at 11:41:49AM -0400, Robert Haas wrote: > On Sat, Mar 28, 2020 at 7:21 AM Sergei Kornilov wrote: >> So... >> We call restore_command only when walreceiver is stopped. >> We use restore_command only in startup process - so we have no race >> condition between processes. >> We have some issues here? Or we can just make restore_command reloadable as >> attached? > > I don't see the problem here, either. Does anyone else see a problem, > or some reason not to press forward with this? Sorry for the late reply. I have been looking at that stuff again, and restore_command can be called in the context of a WAL sender process within the page_read callback of logical decoding via XLogReadDetermineTimeline(), as readTimeLineHistory() could look for a timeline history file. So restore_command is not used only in the startup process. -- Michael signature.asc Description: PGP signature
Re: Unnecessary delay in streaming replication due to replay lag
I would like to revive this thread by submitting a rebased patch to start streaming replication without waiting for the startup process to finish replaying all WAL. The start LSN for streaming is determined to be the LSN that points to the beginning of the most recently flushed WAL segment. The patch passes the tests under src/test/recovery and a top-level "make check". v2-0001-Start-WAL-receiver-before-startup-process-replays.patch Description: v2-0001-Start-WAL-receiver-before-startup-process-replays.patch
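The segment-boundary choice described above is simple modular arithmetic on the flush pointer. Here is a minimal standalone sketch, with XLogRecPtr modeled as a plain 64-bit integer and the default 16MB WAL segment size assumed.

/*
 * Minimal sketch of the start-LSN choice: round the last flushed LSN down
 * to the beginning of its WAL segment (16MB segments assumed).
 */
#include <stdio.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

#define WAL_SEGMENT_SIZE (16 * 1024 * 1024)   /* 16MB, the default */

/* LSN of the first byte of the segment containing 'lsn'. */
static XLogRecPtr
segment_start(XLogRecPtr lsn)
{
    return lsn - (lsn % WAL_SEGMENT_SIZE);
}

int
main(void)
{
    XLogRecPtr flushed = UINT64_C(0x16D68D8);  /* e.g. LSN 0/16D68D8 */
    XLogRecPtr start = segment_start(flushed);

    /* prints 0/1000000: streaming would start at the segment boundary */
    printf("start at %X/%X\n",
           (unsigned int) (start >> 32), (unsigned int) start);
    return 0;
}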
Re: Amcheck: do rightlink verification with lock coupling
> On 8 Aug 2020, at 23:14, Peter Geoghegan wrote: > > I pushed a cleaned up version of this patch just now. I added some > commentary about this canonical example in header comments for the new > function. Thanks for working on this! Best regards, Andrey Borodin.