Hi hackers!

I propose a slight change to WAL compression: compress the whole body of a record when it is bigger than some threshold.

===Rationale===

0. Better compression ratio for full page images (FPIs) when pages are compressed together. Consider the following test:

    set wal_compression to 'zstd';
    create table a as select random() from generate_series(1,1e7);
    create index on a(random); -- warmup to avoid FPI for hint on the heap
    select pg_stat_reset_shared('wal');
    create index on a(random);
    select pg_size_pretty(wal_bytes) from pg_stat_wal;

With the patch, the B-tree index build emits 97MB of WAL, compared to 125MB when FPIs are compressed independently.

1. Compression of big records that are not FPIs. E.g. 2PC records might be big enough to cross the threshold.

2. This might be a path to full WAL compression. In the future I plan to propose a compression context: retaining the compression dictionary between records. Obviously, the context cannot cross checkpoint borders. And a pool of contexts would be needed to fully utilize the efficiency of compression codecs. Anyway, it's too early to theorize.

===Prototype===

I attach a prototype patch. It is functional, but some check-world tests fail. Probably this is because they expect to generate more WAL and do not put much entropy into the data. Or perhaps I missed some bugs. In the present version WAL_DEBUG does not indicate any problems. But a lot of quality assurance and commenting work is still needed. It's a prototype.

To indicate that a WAL record is compressed I use a bit in record->xl_info (XLR_COMPRESSED == 0x04). I found no places that use this bit... If the record is compressed, the record header is followed by information about the compression: a codec byte and a uint32 holding the uncompressed xl_tot_len.

Currently, compression is done on StringInfo buffers that are expanded before the actual WALInsert() happens. If palloc() would be needed during a critical section, the compression is canceled. I do not like the memory accounting before WALInsert; probably something clever can be done about it.

WAL_DEBUG and wal_compression are enabled in the patch for debugging purposes. Of course, I do not propose to turn them on by default.

What do you think?
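
For illustration only, here is a rough standalone sketch of the header extension described above. It is not the code from the patch: every sketch_*/SKETCH_* name, the threshold value, and the choice of native byte order are assumptions made up for this example. Only XLR_COMPRESSED == 0x04, the codec byte and the uint32 uncompressed length come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* From the proposal: bit in xl_info that marks a compressed record. */
#define XLR_COMPRESSED 0x04

/* Hypothetical threshold; the proposal does not name a value. */
#define SKETCH_COMPRESS_THRESHOLD 512

/* Codec byte plus uint32 uncompressed length, as described in the proposal. */
#define SKETCH_COMPRESS_HEADER_SIZE (sizeof(uint8_t) + sizeof(uint32_t))

/* Hypothetical policy: only records above the threshold are compressed. */
static bool
sketch_should_compress(uint32_t tot_len)
{
    return tot_len > SKETCH_COMPRESS_THRESHOLD;
}

/*
 * Write the compression info that follows the regular record header.
 * The length is copied with memcpy because it is unaligned after the
 * codec byte; native byte order is an assumption of this sketch.
 * Returns the number of bytes written.
 */
static size_t
sketch_write_compress_header(uint8_t *dst, uint8_t codec, uint32_t raw_tot_len)
{
    assert(dst != NULL);
    dst[0] = codec;
    memcpy(dst + 1, &raw_tot_len, sizeof(raw_tot_len));
    return SKETCH_COMPRESS_HEADER_SIZE;
}

/*
 * Read the compression info back. Returns 0 and leaves the outputs
 * untouched when the XLR_COMPRESSED bit is not set in xl_info.
 */
static size_t
sketch_read_compress_header(const uint8_t *src, uint8_t xl_info,
                            uint8_t *codec, uint32_t *raw_tot_len)
{
    assert(src != NULL && codec != NULL && raw_tot_len != NULL);
    if ((xl_info & XLR_COMPRESSED) == 0)
        return 0;
    *codec = src[0];
    memcpy(raw_tot_len, src + 1, sizeof(*raw_tot_len));
    return SKETCH_COMPRESS_HEADER_SIZE;
}
```

A reader of the WAL would first check the flag bit, then use the stored uncompressed length to size the buffer before decompressing the rest of the record.
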
Does this approach seem viable?

Best regards, Andrey Borodin.

v0-0001-Compress-big-WAL-records.patch