Dear hackers,

In the discussion, Sawada-san pointed out [1] that the current approach to time-delayed logical replication prevents WAL from being recycled, so I'm planning to close the CF entry for now. This thread, or a forked one, will be registered again once we decide on an alternative approach. Thank you very much for taking the time to join our earlier discussions.
I think that, to solve the issue, logical changes must first be flushed on the subscriber, and workers should apply them only after the specified delay has elapsed. The straightforward way to do this is to follow physical replication and introduce a walreceiver process on the subscriber. More research is needed, but there are at least some benefits:

* The publisher can be shut down even if the apply worker is stuck. An apply worker is more likely to get stuck than a process in physical replication, so this may improve robustness. For more detail, please see another thread [2].
* With synchronous_commit = 'remote_write', the publisher can COMMIT faster, because the walreceiver will flush changes immediately and reply soon. Even if the time delay is enabled, the wait time will not increase.
* It may serve as infrastructure for parallel apply of non-streaming transactions. The basic designs are similar: one process receives changes and others apply them.

I searched old discussions [3] and wiki pages, and found that the initial prototype had a logical walreceiver, but in a later version [4] the apply worker received changes directly. I could not find the reason for that decision, but I suspect the following. Could you please tell me the actual background?

* Performance bottlenecks. If the walreceiver flushes changes and the worker applies them, fsync() is called for every reception.
* Complexity. In this design the walreceiver and the apply worker must share flush/apply progress, and crash recovery needs more consideration. The related discussion can be found in [5].
* Extensibility. In-core logical replication should serve as a sample for external projects. The apply worker is just a background worker that can be launched from an extension, so it is easy to understand. If it depended deeply on the walreceiver, other projects could not follow it.
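To make the proposed split between receiving and applying concrete, here is a minimal sketch in Python. It is not PostgreSQL code and does not reflect any existing patch. All names (Change, Subscriber, spool, min_apply_delay, receive, apply_ready) are hypothetical, the in-memory list only stands in for changes flushed to disk, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Change:
    lsn: int            # position of the change in the publisher's WAL
    commit_time: float  # commit timestamp on the publisher, in seconds
    payload: str


@dataclass
class Subscriber:
    min_apply_delay: float  # hypothetical delay setting, in seconds
    # Stands in for a durable on-disk queue (assumption: no real fsync here).
    spool: List[Change] = field(default_factory=list)
    flushed_lsn: int = 0    # progress reported back to the publisher
    applied_lsn: int = 0    # progress of the apply role
    applied: List[str] = field(default_factory=list)

    def receive(self, change: Change) -> int:
        """Receiver role: persist the change and acknowledge it at once."""
        self.spool.append(change)
        self.flushed_lsn = change.lsn
        # The acknowledgement does not wait for the apply delay.
        return self.flushed_lsn

    def apply_ready(self, now: float) -> int:
        """Apply role: apply spooled changes whose delay has elapsed."""
        while self.spool and self.spool[0].commit_time + self.min_apply_delay <= now:
            change = self.spool.pop(0)
            self.applied.append(change.payload)
            self.applied_lsn = change.lsn
        return self.applied_lsn


sub = Subscriber(min_apply_delay=60.0)
sub.receive(Change(lsn=100, commit_time=0.0, payload="INSERT 1"))
sub.receive(Change(lsn=200, commit_time=30.0, payload="INSERT 2"))
sub.apply_ready(now=70.0)  # only the first change is old enough to apply
```

In this sketch flushed_lsn advances as soon as a change is received, while applied_lsn lags behind by the delay. That gap is the progress the two roles would have to share, which is the complexity concern listed above.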
[1]: https://www.postgresql.org/message-id/CAD21AoAeG2%2BRsUYD9%2BmEwr8-rrt8R1bqpe56T2D%3DeuO-Qs-GAg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/flat/TYAPR01MB586668E50FC2447AD7F92491F5E89%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[3]: https://www.postgresql.org/message-id/201206131327.24092.andres%402ndquadrant.com
[4]: https://www.postgresql.org/message-id/37e19ad5-f667-2fe2-b95b-bba69c5b6...@2ndquadrant.com
[5]: https://www.postgresql.org/message-id/1339586927-13156-12-git-send-email-andres%402ndquadrant.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED