On Sat, Mar 15, 2014 at 11:27 PM, Heikki Linnakangas < hlinnakan...@vmware.com> wrote:
> On 03/15/2014 08:40 PM, Fujii Masao wrote: > >> Hi, >> >> I executed the following statements in HEAD and 9.3, and compared >> the size of WAL which were generated by data insertion in GIN index. >> >> --------------------- >> CREATE EXTENSION pg_trgm; >> CREATE TABLE hoge (col1 text); >> CREATE INDEX hogeidx ON hoge USING gin (col1 gin_trgm_ops) WITH >> (FASTUPDATE = off); >> >> CHECKPOINT; >> SELECT pg_switch_xlog(); >> SELECT pg_switch_xlog(); >> >> SELECT pg_current_xlog_location(); >> INSERT INTO hoge SELECT 'POSTGRESQL' FROM generate_series(1, 1000000); >> SELECT pg_current_xlog_location(); >> --------------------- >> >> The results of WAL size are >> >> 960 MB (9.3) >> 2113 MB (HEAD) >> >> The WAL size in HEAD was more than two times bigger than that in 9.3. >> Recently the source code of GIN index has been changed dramatically. >> Is the increase in GIN-related WAL intentional or a bug? >> > > It was somewhat expected. Updating individual items on the new-format GIN > pages requires decompressing and recompressing the page, and the > recompressed posting lists need to be WAL-logged. Which generates much > larger WAL records. > > That said, I didn't expect the difference to be quite that big when you're > appending to the end of the table. When the new entries go to the end of > the posting lists, you only need to recompress and WAL-log the last posting > list, which is max 256 bytes long. But I guess that's still a lot more WAL > than in the old format. > > That could be optimized, but I figured we can live with it, thanks to the > fastupdate feature. Fastupdate allows amortizing that cost over several > insertions. But of course, you explicitly disabled that... Let me know if you want me to write patch addressing this issue. ------ With best regards, Alexander Korotkov.