Thank you for looking into this, Jakub, Aleksander, Michael, and Tom.

There are more scenarios where files might be missing: what about a backup that
somehow misses files?
But there is also deliberate sabotage: you can hide data this way, and it would be
hard to detect, and even
harder to understand what happened, why it happened, and how it happened. I dare
say it is quite ideal
if you want a name removed from a database in a covert way, know something
about the
database, and have access, or can somehow manipulate files in the data directory.

Without having the exact source code files and functions handy, the way I think
segments work
in PostgreSQL is this: because there is no indicator of size, the database
scans
the relfilenode, and when it reaches the segment size limit (1 GB), it tries to open
the next segment. If that
segment doesn't exist, the database assumes it has reached the end of the relation. That
is all. That is why there
is no error: the situation looks exactly like the case where there truly is no more data, and there is
no metadata from which to understand
that more data is supposed to exist.
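To illustrate the failure mode, here is a minimal sketch (in Python, with tiny made-up segment sizes rather than the real 1 GB / 131072-page limit, and dicts standing in for segment files) of how an mdnblocks()-style scan can mistake a missing middle segment for the end of the relation:

```python
# Hedged sketch: simulates how a segment scan that has no recorded size
# can mistake a missing middle segment for end-of-relation.

SEG_SIZE = 4  # pages per segment file (stand-in for the real 1 GB limit)

def visible_pages(segments):
    """segments: dict mapping segment number -> page count of that file.
    Scan segno 0, 1, 2, ...; a full segment means 'try the next one';
    a missing or short segment is taken as the end of the relation."""
    total = 0
    segno = 0
    while segno in segments:               # a failed open() would land here
        total += segments[segno]
        if segments[segno] < SEG_SIZE:     # short segment -> must be the last
            return total
        segno += 1
    return total                           # missing segment looks like EOF

# A 3-segment relation (4 + 4 + 2 = 10 pages):
intact  = {0: 4, 1: 4, 2: 2}
# Same relation with segment 1 deleted: the scan silently stops after seg 0.
damaged = {0: 4, 2: 2}

assert visible_pages(intact) == 10
assert visible_pages(damaged) == 4   # looks like a perfectly valid 4-page table
```

The damaged case raises no error at all: from the scanner's point of view it is indistinguishable from a relation that legitimately ends after segment 0.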

That is why I brought up the primary key: it contains "indirect" data that makes
it possible to detect that a
table has a missing segment. But I think what is really needed is a maximum
page number that records
the known size.
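Continuing the sketch above (same toy model, and a hypothetical `known_max_pages` value that real PostgreSQL does not store): if the known size were recorded somewhere, the silent truncation would turn into an immediate, detectable error:

```python
# Hedged sketch: with a (hypothetical) recorded maximum page count,
# a missing segment becomes detectable instead of silent.

SEG_SIZE = 4  # pages per segment file (stand-in for the real 1 GB limit)

def scan_pages(segments):
    """Count the pages a naive segment scan would see (stops at first gap)."""
    total, segno = 0, 0
    while segno in segments:
        total += segments[segno]
        if segments[segno] < SEG_SIZE:
            break
        segno += 1
    return total

def verify(segments, known_max_pages):
    """Compare the scanned size against the recorded size."""
    seen = scan_pages(segments)
    if seen < known_max_pages:
        raise RuntimeError(
            f"relation truncated: expected {known_max_pages} pages, found {seen}")
    return seen

assert verify({0: 4, 1: 4, 2: 2}, 10) == 10   # intact relation passes
# With segment 1 deleted, verify({0: 4, 2: 2}, 10) raises RuntimeError.
```

This is only a model of the idea, not a proposal for where such metadata would live; the point is that any authoritative size record turns "assume EOF" into a checkable invariant.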



Frits Hoogland




> On 1 Oct 2025, at 13:20, Jakub Wartak <[email protected]> wrote:
> 
> On Wed, Oct 1, 2025 at 9:02 AM Michael Banck <[email protected]> wrote:
>> 
>> Hi,
>> 
>> wow, this is one of the most terrifying threads I've ever seen...
> 
> Same.
> 
>> On Tue, Sep 30, 2025 at 12:41:29PM -0400, Tom Lane wrote:
>>> Aleksander Alekseev <[email protected]> writes:
>>>>> Therefore, I would like to request an enhancement: add an option to
>>>>> verify_heapam() that causes the primary key index to be scanned and makes
>>>>> sure that all line pointers in the index point to existing tuples.
>>> 
>>>> ... IMO there is little value in adding a check for the existence of
>>>> the segments for a single table. And the *real* check will not differ
>>>> much from something like SELECT * FROM my_table, or from making a
>>>> complete backup of the database.
>>> 
>>> As Frits mentioned, neither of those actions will really notice if a
>>> table has been truncated via loss of a segment.
>> 
>> Is there a valid case for a missing segment? If not, couldn't this be
>> caught somewhere in the storage manager?
>> 
> 
> I've taken a look at PG17, and in _mdfd_openseg() there's "if fd < 0
> return NULL" after open(), but of its callers only _mdfd_getseg()
> seems to be alerting on that NULL. To me this seems like a bug,
> because I've seen way too many times people and software deleting
> files randomly. Even simple crashes (with e2fsck, xfs_repair) could
> put orphaned inodes into /lost+found. IMHO all files should be opened
> at least on startup to check integrity, because the non-zero return
> code (during such a SELECT) for openat(2) seems to be coming out of
> RelationGetNumberOfBlocksInFork()->table_block_relation_size()->smgrnblocks()->mdnblocks()->_mdfd_openseg().
> Now if the 1st seg file were missing we would complain in
> mdopenfork(). mdnblocks() even says "all active segments of the
> relation are opened...", but apparently even that is not true.
> 
> The bigger context seems to be that 049469e7e7cfe0c69 (2015) could
> be the culprit here as well, as it is stated there that mdnblocks()
> could create zero-length files back in the day, and it removed the
> ereport(ERROR) raised when unable to access that file.
> 
> Another idea (other than this being a bug) is that Thomas had a large
> relation patchset back in [1], but I wouldn't be a fan of us operating
> on 31-32TB files ;)
> 
> -J.
> 
> [1] - 
> https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BBGXwMbrvzXAjL8VMGf25y_ga_XnO741g10y0%3Dm6dDiA%40mail.gmail.com
