Re: data corruption hazard in reorderbuffer.c

Tomas Vondra Thu, 15 Jul 2021 15:32:28 -0700

Hi,

I think it's mostly futile to list all the possible issues this might have caused - if you skip arbitrary decoded changes, that can trigger pretty much any bug in reorder buffer. But those bugs can be triggered by various other issues, of course.

It's hard to say what was the cause, but the "logic" bugs are probably permanent, while the issues triggered by I/O probably disappear after a restart?

That being said, I agree this seems like an issue and we should not ignore I/O errors. I'd bet other places using transient files (like sorting or hashagg spilling to disk) have the same issue, although in that case the impact is likely limited to a single query.

I wonder if sync before the close is an appropriate solution, though. It seems rather expensive, and those files are meant to be "temporary" (i.e. we don't keep them over restart). So maybe we could ensure the consistency in a cheaper way - perhaps tracking some sort of checksum for each file, or something like that?
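
To make the checksum idea a bit more concrete, here is a rough standalone sketch. It is not reorderbuffer.c code: the names spill_write, spill_verify and checksum_update are made up for illustration, it uses plain stdio rather than the transient-file routines, and FNV-1a is only a stand-in for whatever checksum we would actually pick. The point is just that the writer reports write/close failures instead of ignoring them, and the reader can detect a damaged file without any sync.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; nothing here is an existing PostgreSQL API. */
#define CHECKSUM_INIT 2166136261u	/* FNV-1a 32-bit offset basis */

/* Fold len bytes of buf into a running FNV-1a checksum. */
static uint32_t
checksum_update(uint32_t sum, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	for (size_t i = 0; i < len; i++)
	{
		sum ^= p[i];
		sum *= 16777619u;		/* FNV-1a 32-bit prime */
	}
	return sum;
}

/*
 * Write buf to path and hand back its checksum.  Returns 0 on success,
 * -1 if open, write or close fails (close errors are not ignored).
 */
static int
spill_write(const char *path, const void *buf, size_t len, uint32_t *sum_out)
{
	FILE	   *f;

	assert(sum_out != NULL);

	f = fopen(path, "wb");
	if (f == NULL)
		return -1;
	if (fwrite(buf, 1, len, f) != len)
	{
		fclose(f);
		return -1;
	}
	if (fclose(f) != 0)
		return -1;

	*sum_out = checksum_update(CHECKSUM_INIT, buf, len);
	return 0;
}

/*
 * Re-read the file and compare against the checksum remembered by the
 * writer.  Returns 0 if it matches, -1 on mismatch or read error.
 */
static int
spill_verify(const char *path, uint32_t expected)
{
	FILE	   *f = fopen(path, "rb");
	unsigned char chunk[4096];
	uint32_t	sum = CHECKSUM_INIT;
	size_t		n;

	if (f == NULL)
		return -1;
	while ((n = fread(chunk, 1, sizeof(chunk), f)) > 0)
		sum = checksum_update(sum, chunk, n);
	if (ferror(f))
	{
		fclose(f);
		return -1;
	}
	fclose(f);
	return (sum == expected) ? 0 : -1;
}
```

The caller would keep the checksum in memory next to the file name, so a missing or short file shows up as a mismatch when the data is read back, at the cost of one pass over bytes we are reading anyway.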



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
