On 2016-03-08 09:28:15 +0100, Fabien COELHO wrote:
> >>>Now I cannot see how having one context per table space would have a
> >>>significant negative performance impact.
> >>The 'dirty data' etc. limits are global, not per block device. By having
> >>several contexts with unflushed dirty data the total amount of dirty
> >>data in the kernel increases.
> >Possibly, but how much? Do you have experimental data to back up that
> >this is really an issue?
> >We are talking about 32 (context size) * #table spaces * 8KB buffers = 4MB
> >of dirty buffers to manage for 16 table spaces, I do not see that as a
> >major issue for the kernel.
We flush in those increments, but that doesn't mean there's only that
much dirty data. I regularly see an order of magnitude more being dirty.
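For reference, the quoted 4MB figure is just the product of the three
numbers above. A minimal sketch of that arithmetic, assuming the default
8KB block size; the function and constant names are made up for
illustration and are not PostgreSQL identifiers:

```python
# Hypothetical helper: upper bound on pages queued in flush contexts,
# i.e. written but not yet handed to a flush request.
BLOCK_SIZE = 8192  # bytes, assuming the default 8KB block size


def pending_bytes(context_size_pages, n_tablespaces, block_size=BLOCK_SIZE):
    """Worst case: every per-tablespace context is full at the same time."""
    return context_size_pages * n_tablespaces * block_size


bound = pending_bytes(32, 16)
print(bound // (1024 * 1024), "MiB")  # 4 MiB
```

This only bounds what sits in the contexts themselves. It says nothing
about how much data the kernel still considers dirty, which is the
quantity that can be roughly ten times larger (on the order of 40MB
against this bound).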
I had originally kept it with one context per tablespace after
refactoring this, but found that it gave worse results in rate-limited
loads, even over only two tablespaces. That's on SSDs though.
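To make the difference between the two variants concrete, here is a toy
model (not PostgreSQL code, and not a substitute for benchmarking). It
assumes writes alternate round-robin between tablespaces, which is the
worst case for per-tablespace contexts, and it counts only pages that
have been written but not yet handed to a flush request. All names are
hypothetical.

```python
def peak_pending_pages(n_tablespaces, n_writes, context_size=32, shared=True):
    """Peak number of written pages not yet covered by a flush request."""
    # One counter for a shared context, otherwise one per tablespace.
    pending = [0] * (1 if shared else n_tablespaces)
    peak = 0
    for i in range(n_writes):
        # Assumption: writes hit the tablespaces in strict round-robin order.
        slot = 0 if shared else i % n_tablespaces
        pending[slot] += 1
        peak = max(peak, sum(pending))
        if pending[slot] == context_size:
            pending[slot] = 0  # context is full: a flush is issued for it
    return peak


print(peak_pending_pages(2, 1024, shared=True))   # 32
print(peak_pending_pages(2, 1024, shared=False))  # 63
```

Even in this simplified model the per-tablespace variant roughly doubles
the peak for two tablespaces. It deliberately ignores pages already in
writeback and the kernel's own dirty accounting, so it cannot by itself
explain the results I measured.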
> To complete the argument, the 4MB is just a worst case scenario, in reality
> flushing the different contexts would be randomized over time, so the
> frequency of flushing a context would be exactly the same in both cases
> (shared or per table space context) if the checkpoints are the same size,
> just that with a shared context each flushing potentially targets all
> tablespaces with a few pages, while with the other version each flushing
> targets one table space only.
The number of pages still in writeback (i.e. for which sync_file_range
has been issued, but which haven't finished running yet) at the end of
the checkpoint matters for the latency hit incurred by the fsync()s from
smgrsync(); at least by my measurement.
My current plan is to commit this with the current behaviour (as in this
week[end]), and then do some actual benchmarking on this specific
part. It's imo a relatively minor detail.
Sent via pgsql-hackers mailing list (email@example.com)