> On 6 Sep 2020, at 18:26, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> 
> On 05/09/2020 14:53, Andrey M. Borodin wrote:
>> Thanks for ideas, Heikki. Please see v13 with proposed changes.
> 
> Thanks, that was quick!
> 
>> But I've found out that logging page-by-page slows down GiST build by
>> approximately 15% (when CPU constrained). Though I think that this
>> is IO-wise.
> Hmm, any ideas why that is? log_newpage_range() writes one WAL record for 32 
> pages, while now you're writing one record per page, so you'll have a little 
> bit more overhead from that. But 15% seems like a lot.
I do not know. My guess is that it is some effect of pglz compression during its
cold stage: it may be slower and compress less well than pglz with a warmed-up
cache table. But this is a shot in the dark.
Nevertheless, here is a patch set identical to v13, but with a third part that
logs flushed pages in batches of 32.
This brings CPU performance back, and makes it slightly better than it was
before page-by-page logging.
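
To make the record-count difference concrete, here is a minimal sketch. It is not the patch code: the function names are hypothetical, and it only counts WAL records under the assumption of 32 pages per record mentioned above for log_newpage_range(). It does not model the real WAL format or any compression cost.

```python
# Illustrative only: counts WAL records, not real PostgreSQL WAL traffic.
BATCH_SIZE = 32  # pages per record, per the log_newpage_range() remark above


def records_page_by_page(n_pages: int) -> int:
    """One WAL record per page, as in page-by-page logging."""
    return n_pages


def records_batched(n_pages: int, batch_size: int = BATCH_SIZE) -> int:
    """One WAL record per full batch, plus one for a trailing partial batch."""
    pending = 0   # pages buffered but not yet logged
    records = 0   # WAL records emitted so far
    for _ in range(n_pages):
        pending += 1
        if pending == batch_size:
            records += 1      # flush a full batch as a single record
            pending = 0
    if pending:
        records += 1          # flush the remainder at the end of the build
    return records
```

For any page count, the batched variant emits roughly 1/32 of the records, which is the per-record overhead the third patch is meant to remove.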

Some details about the test:
macOS, 6-core i7
psql -c '\timing' -c "create table x as select point (random(),random()) from 
generate_series(1,10000000,1);" -c "create index on x using gist (point);"

With patch v13 this takes 20.567 seconds, with v14 18.149 seconds, and with v12
~18.3 seconds (so the slowdown is closer to 10%, by the way; sorry for the
miscalculation). This was not a statistically rigorous test, just a quick laptop
benchmark with 2-3 runs to verify stability.
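
For reference, the relative differences can be recomputed from the timings quoted above. This is only a sketch of the arithmetic: it assumes the slowdown is measured against the faster run, and the v12 figure is approximate.

```python
# Index build times in seconds, taken from the runs quoted above.
timings = {"v12": 18.3, "v13": 20.567, "v14": 18.149}  # v12 is approximate


def slowdown_pct(slow: float, fast: float) -> float:
    """Extra time of the slow run, as a percentage of the fast run."""
    return (slow - fast) / fast * 100.0


v13_vs_v12 = slowdown_pct(timings["v13"], timings["v12"])
v13_vs_v14 = slowdown_pct(timings["v13"], timings["v14"])

print(f"v13 vs v12: {v13_vs_v12:.1f}% slower")
print(f"v13 vs v14: {v13_vs_v14:.1f}% slower")
```

Both figures land between 12% and 14%, below the 15% estimated earlier.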

Best regards, Andrey Borodin.

Attachment: v14-0001-Add-sort-support-for-point-gist_point_sortsuppor.patch
Description: Binary data

Attachment: v14-0002-Implement-GiST-build-using-sort-support.patch
Description: Binary data

Attachment: v14-0003-Log-GiST-build-with-packs-of-32-pages.patch
Description: Binary data
