Hi hackers!

I propose a slight change to WAL compression: compress the body of a big
record if it is larger than some threshold.

===Rationale===
0. Better compression ratio for full page images when pages are compressed 
together.

Consider the following test:

set wal_compression to 'zstd';
create table a as select random() from generate_series(1,1e7);
create index on a(random); -- warmup to avoid FPI for hint on the heap
select pg_stat_reset_shared('wal');
create index on a(random);
select pg_size_pretty(wal_bytes) from pg_stat_wal;

With the patch, building the B-tree index emits 97 MB of WAL, compared with
125 MB when FPIs are compressed independently.

1. Compression of big records that are not FPIs. E.g. 2PC records might be big
enough to cross the threshold.

2. This might be a path to full WAL compression. In the future I plan to
propose a compression context: retaining the compression dictionary between
records. Obviously, the context cannot cross checkpoint boundaries. And a pool
of contexts would be needed to fully exploit the efficiency of compression
codecs. Anyway, it's too early to theorize.

===Prototype===
I attach a prototype patch. It is functional, but some world tests fail.
Probably because they expect to generate more WAL without putting in much
entropy. Or perhaps I missed some bugs. In the present version WAL_DEBUG does
not indicate any problems. But a lot of quality assurance and commenting work
is needed. It's a prototype.

To indicate that a WAL record is compressed I use a bit in record->xl_info
(XLR_COMPRESSED == 0x04). I found no places that use this bit...
If the record is compressed, the record header is followed by information
about the compression: a codec byte and a uint32 holding the uncompressed
xl_tot_len.
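To make the layout concrete, here is a minimal standalone sketch of the flag check and the extra header fields. Only XLR_COMPRESSED and its value 0x04 come from the patch; the struct, the function names and the serialization in host byte order are my illustrative assumptions, not the actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag bit in xl_info, value as described above. */
#define XLR_COMPRESSED 0x04

/*
 * Hypothetical extra header that follows the regular record header when
 * XLR_COMPRESSED is set. Names are illustrative, not taken from the patch.
 */
typedef struct SketchCompressHeader
{
    uint8_t  codec;        /* which compression codec was used */
    uint32_t raw_tot_len;  /* xl_tot_len before compression */
} SketchCompressHeader;

/* On-disk size: written field by field, so no struct padding is included. */
#define SKETCH_COMPRESS_HDR_SIZE (sizeof(uint8_t) + sizeof(uint32_t))

/* True if the record's xl_info marks it as compressed. */
static bool
sketch_record_is_compressed(uint8_t xl_info)
{
    return (xl_info & XLR_COMPRESSED) != 0;
}

/* Write the extra header into dst; returns the number of bytes written. */
static size_t
sketch_write_compress_header(uint8_t *dst, const SketchCompressHeader *h)
{
    assert(dst != NULL && h != NULL);
    dst[0] = h->codec;
    memcpy(dst + 1, &h->raw_tot_len, sizeof(uint32_t));
    return SKETCH_COMPRESS_HDR_SIZE;
}

/* Read the extra header from src; returns the number of bytes consumed. */
static size_t
sketch_read_compress_header(const uint8_t *src, SketchCompressHeader *h)
{
    assert(src != NULL && h != NULL);
    h->codec = src[0];
    memcpy(&h->raw_tot_len, src + 1, sizeof(uint32_t));
    return SKETCH_COMPRESS_HDR_SIZE;
}
```

A reader would first test the flag, then consume these five bytes before decompressing the rest of the record.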

Currently, compression is done in StringInfo buffers that are expanded before
the actual WALInsert() happens. If palloc() is needed during a critical
section, the compression is canceled. I do not like memory accounting before
WALInsert; probably something clever can be done about it.
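The decision logic can be sketched roughly as follows. This is a simplified standalone model, not the patch code: SketchBuf stands in for the StringInfo buffer, and the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the pre-expanded scratch buffer. */
typedef struct SketchBuf
{
    size_t capacity;  /* bytes already allocated */
} SketchBuf;

/*
 * Decide whether a record gets compressed.
 *
 * record_len          size of the assembled record
 * threshold           records at or below this size are left alone
 * needed              worst-case output size the codec may require
 * in_critical_section true if allocating memory is not allowed right now
 */
static bool
sketch_should_compress(size_t record_len, size_t threshold, size_t needed,
                       SketchBuf *scratch, bool in_critical_section)
{
    assert(scratch != NULL);

    if (record_len <= threshold)
        return false;           /* too small to bother */

    if (scratch->capacity >= needed)
        return true;            /* buffer was expanded in advance */

    if (in_critical_section)
        return false;           /* would need palloc(): cancel compression */

    /* Outside a critical section the buffer can simply be enlarged. */
    scratch->capacity = needed;
    return true;
}
```

The record is written uncompressed whenever this returns false, so canceling compression is always safe; it only costs WAL volume.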

WAL_DEBUG and wal_compression are enabled for debugging purposes. Of course, I 
do not propose to turn them on by default.


What do you think? Does this approach seem viable?


Best regards, Andrey Borodin.

Attachment: v0-0001-Compress-big-WAL-records.patch
Description: Binary data
