On Mon, Mar 16, 2020 at 09:21:22AM +0100, Julien Rouhaud wrote: > On Mon, Mar 16, 2020 at 12:29:28PM +0900, Michael Paquier wrote: >> With a large amount of >> shared buffer eviction you actually increase the risk of torn page >> reads. Instead of a logic relying on partition mapping locks, which >> could be unwise on performance grounds, did you consider different >> approaches? For example a kind of pre-emptive lock on the page in >> storage to prevent any shared buffer operation to happen while the >> block is read from storage, that would act like a barrier. > > Even with a workload having a large shared_buffers eviction pattern, I don't > think that there's a high probability of hitting a torn page. Unless I'm > mistaken it can only happen if all those steps happen concurrently to doing > the > block read just after releasing the LWLock: > > - postgres read the same block in shared_buffers (including all the locking) > - dirties it > - writes part of the page > > It's certainly possible, but it seems so unlikely that the optimistic > lock-less > approach seems like a very good tradeoff.
Having false reports in this area could be very confusing for the
user. That's for example possible now with checksum verification and
base backups.
>> I guess that this leads to the fact that this function may be better as
>> a contrib module, with the addition of some better-suited APIs in core
>> (see paragraph above).
>
> Below?
Above. This thought more precisely:
>> For example a kind of pre-emptive lock on the page in
>> storage to prevent any shared buffer operation to happen while the
>> block is read from storage, that would act like a barrier.
> For the record when I first tested that feature I did try to check dirty
> blocks, and it seemed that dirty blocks of shared relation were sometimes
> wrongly reported as corrupted. I didn't try to investigate more though.
Hmm. It would be good to look at that, correct verification of shared
relations matter.
>> + * - if a block is dirty in shared_buffers, it's ignored as it'll be
>> flushed to
>> + * disk either before the end of the next checkpoint or during recovery in
>> + * case of unsafe shutdown
>> Not sure that the indentation is going to react well on that part of
>> the patch, perhaps it would be better to add some "/*-------" at the
>> beginning and end of the comment block to tell pgindent to ignore this
>> part?
>
> Ok. Although I think only the beginning comment is needed?
From src/tools/pgindent/README:
"pgindent will reflow any comment block that's not at the left margin.
If this messes up manual formatting that ought to be preserved,
protect the comment block with some dashes:"
/*----------
* Text here will not be touched by pgindent.
*----------
*/
>> Based on the feedback gathered on this thread, I guess that you should
>> have a SRF returning the list of broken blocks, as well as NOTICE
>> messages.
>
> The current patch has an SRF and a WARNING message, do you want an additional
> NOTICE message or downgrade the existing one?
Right, not sure which one is better, for zero_damaged_pages a WARNING
is used.
--
Michael
signature.asc
Description: PGP signature
