Re: Print physical file path when checksum check fails

2020-03-19 Thread Hubert Zhang
I have updated the patch based on the previous comments. Sorry for the late
patch.

I removed `SetZeroDamagedPageInChecksum` and add `zeroDamagePage` flag in
smgrread to determine whether we should zero damage page when an error
happens. It depends on the caller.

`GetRelationFilePath` is removed as well. We print segno on the fly.

On Thu, Feb 20, 2020 at 2:33 PM Hubert Zhang  wrote:

> Thanks,
>
> On Thu, Feb 20, 2020 at 11:36 AM Andres Freund  wrote:
>
>> Hi,
>>
>> On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:
>> > On Wed, Feb 19, 2020 at 03:00:54PM +0900, Kyotaro Horiguchi wrote:
>> > > I have had support requests related to broken block several times, and
>> > > (I think) most of *them* had hard time to locate the broken block or
>> > > even broken file.  I don't think it is useles at all, but I'm not sure
>> > > it is worth the additional complexity.
>> >
>> > I handle stuff like that from time to time, and any reports usually
>> > go down to people knowledgeable about PostgreSQL enough to know the
>> > difference.  My point is that in order to know where a broken block is
>> > physically located on disk, you need to know four things:
>> > - The block number.
>> > - The physical location of the relation.
>> > - The size of the block.
>> > - The length of a file segment.
>> > The first two items are printed in the error message, and you can
>> > guess easily the actual location (file, offset) with the two others.
>>
>> > I am not necessarily against improving the error message here, but
>> > FWIW I think that we need to consider seriously if the code
>> > complications and the maintenance cost involved are really worth
>> > saving from one simple calculation.
>>
>> I don't think it's that simple for most.
>>
>> And if we e.g. ever get the undo stuff merged, it'd get more
>> complicated, because they segment entirely differently. Similar, if we
>> ever manage to move SLRUs into the buffer pool and checksummed, it'd
>> again work differently.
>>
>> Nor is it architecturally appealing to handle checksums in multiple
>> places above the smgr layer: For one, that requires multiple places to
>> compute verify them. But also, as the way checksums are computed depends
>> on the page format etc, it'll likely change for things like undo/slru -
>> which then again will require additional smarts if done above the smgr
>> layer.
>>
>
> So considering undo staff, it's better to move checksum logic into md.c
> I will keep it in the new patch.
>
> On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:
>
> > Particularly, quickly reading through the patch, I am rather unhappy
>> > about the shape of the second patch which pushes down the segment
>> > number knowledge into relpath.c, and creates more complication around
>> > the handling of zero_damaged_pages and zero'ed pages.  -- Michael
>>
>> I do not like the SetZeroDamagedPageInChecksum stuff at all however.
>>
>>
> I'm +1 on the first concern, and I will delete the new added function
> `GetRelationFilePath`
> in relpath.c and append the segno directly in error message inside
> function `VerifyPage`
>
> As for SetZeroDamagedPageInChecksum, the reason why I introduced it is
> that when we are doing
> smgrread() and one of the damaged page failed to pass the checksum check,
> we could not directly error
> out, since the caller of smgrread() may tolerate this error and just zero
> all the damaged page plus a warning message.
> Also, we could not just use GUC zero_damaged_pages to do this branch,
> since we also have ReadBufferMode(i.e. RBM_ZERO_ON_ERROR) to control it.
>
> To get rid of SetZeroDamagedPageInChecksum, one idea is to pass
> zero_damaged_page flag into smgrread(), something like below:
> ==
>
> extern void smgrread(SMgrRelation reln, ForkNumber forknum,
>
> BlockNumber blocknum, char *buffer, int flag);
>
> ===
>
>
> Any comments?
>
>
>
> --
> Thanks
>
> Hubert Zhang
>


-- 
Thanks

Hubert Zhang


0003-Print-physical-file-path-when-verify-checksum-failed.patch
Description: Binary data


Re: Print physical file path when checksum check fails

2020-02-19 Thread Hubert Zhang
Thanks Kyotaro,

On Wed, Feb 19, 2020 at 2:02 PM Kyotaro Horiguchi 
wrote:

> At Wed, 19 Feb 2020 13:28:04 +0900, Michael Paquier 
> wrote in
> > On Wed, Feb 19, 2020 at 01:07:36PM +0900, Kyotaro Horiguchi wrote:
> > > If we also verify checksum in md layer, callback is overkill since the
> > > immediate caller consumes the event immediately.  We can signal the
> > > error by somehow returning a file tag.
> >
> > FWIW, I am wondering if there is any need for a change here and
> > complicate more the code.  If you know the block number, the page size
> > and the segment file size you can immediately guess where is the
> > damaged block.  The first information is already part of the error
>
> I have had support requests related to broken block several times, and
> (I think) most of *them* had hard time to locate the broken block or
> even broken file.  I don't think it is useles at all, but I'm not sure
> it is worth the additional complexity.
>
> > damaged block.  The first information is already part of the error
> > message, and the two other ones are constants defined at
> > compile-time.
>
> May you have misread the snippet?
>
> What Hubert proposed is:
>
>  "invalid page in block %u of relation file %s; zeroing out page",
> blkno, 
>
> The second format in my messages just before is:
>   "invalid page in block %u in relation %u, file \"%s\"",
>  blockNum, smgr->smgr_rnode.node.relNode, smgrfname()
>
> All of them are not compile-time constant at all.
>
>
I like your error message, the block number is relation level not file
level.
I 'll change the error message to
"invalid page in block %u of relation %u, file %s"


-- 
Thanks

Hubert Zhang


Re: Print physical file path when checksum check fails

2020-02-19 Thread Hubert Zhang
Thanks,

On Thu, Feb 20, 2020 at 11:36 AM Andres Freund  wrote:

> Hi,
>
> On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:
> > On Wed, Feb 19, 2020 at 03:00:54PM +0900, Kyotaro Horiguchi wrote:
> > > I have had support requests related to broken block several times, and
> > > (I think) most of *them* had hard time to locate the broken block or
> > > even broken file.  I don't think it is useles at all, but I'm not sure
> > > it is worth the additional complexity.
> >
> > I handle stuff like that from time to time, and any reports usually
> > go down to people knowledgeable about PostgreSQL enough to know the
> > difference.  My point is that in order to know where a broken block is
> > physically located on disk, you need to know four things:
> > - The block number.
> > - The physical location of the relation.
> > - The size of the block.
> > - The length of a file segment.
> > The first two items are printed in the error message, and you can
> > guess easily the actual location (file, offset) with the two others.
>
> > I am not necessarily against improving the error message here, but
> > FWIW I think that we need to consider seriously if the code
> > complications and the maintenance cost involved are really worth
> > saving from one simple calculation.
>
> I don't think it's that simple for most.
>
> And if we e.g. ever get the undo stuff merged, it'd get more
> complicated, because they segment entirely differently. Similar, if we
> ever manage to move SLRUs into the buffer pool and checksummed, it'd
> again work differently.
>
> Nor is it architecturally appealing to handle checksums in multiple
> places above the smgr layer: For one, that requires multiple places to
> compute verify them. But also, as the way checksums are computed depends
> on the page format etc, it'll likely change for things like undo/slru -
> which then again will require additional smarts if done above the smgr
> layer.
>

So considering undo staff, it's better to move checksum logic into md.c
I will keep it in the new patch.

On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:

> Particularly, quickly reading through the patch, I am rather unhappy
> > about the shape of the second patch which pushes down the segment
> > number knowledge into relpath.c, and creates more complication around
> > the handling of zero_damaged_pages and zero'ed pages.  -- Michael
>
> I do not like the SetZeroDamagedPageInChecksum stuff at all however.
>
>
I'm +1 on the first concern, and I will delete the new added function
`GetRelationFilePath`
in relpath.c and append the segno directly in error message inside function
`VerifyPage`

As for SetZeroDamagedPageInChecksum, the reason why I introduced it is that
when we are doing
smgrread() and one of the damaged page failed to pass the checksum check,
we could not directly error
out, since the caller of smgrread() may tolerate this error and just zero
all the damaged page plus a warning message.
Also, we could not just use GUC zero_damaged_pages to do this branch, since
we also have ReadBufferMode(i.e. RBM_ZERO_ON_ERROR) to control it.

To get rid of SetZeroDamagedPageInChecksum, one idea is to pass
zero_damaged_page flag into smgrread(), something like below:
==

extern void smgrread(SMgrRelation reln, ForkNumber forknum,

BlockNumber blocknum, char *buffer, int flag);

===


Any comments?



-- 
Thanks

Hubert Zhang


Re: Print physical file path when checksum check fails

2020-02-19 Thread Andres Freund
Hi,

On 2020-02-19 16:48:45 +0900, Michael Paquier wrote:
> On Wed, Feb 19, 2020 at 03:00:54PM +0900, Kyotaro Horiguchi wrote:
> > I have had support requests related to broken block several times, and
> > (I think) most of *them* had hard time to locate the broken block or
> > even broken file.  I don't think it is useles at all, but I'm not sure
> > it is worth the additional complexity.
>
> I handle stuff like that from time to time, and any reports usually
> go down to people knowledgeable about PostgreSQL enough to know the
> difference.  My point is that in order to know where a broken block is
> physically located on disk, you need to know four things:
> - The block number.
> - The physical location of the relation.
> - The size of the block.
> - The length of a file segment.
> The first two items are printed in the error message, and you can
> guess easily the actual location (file, offset) with the two others.

> I am not necessarily against improving the error message here, but
> FWIW I think that we need to consider seriously if the code
> complications and the maintenance cost involved are really worth
> saving from one simple calculation.

I don't think it's that simple for most.

And if we e.g. ever get the undo stuff merged, it'd get more
complicated, because they segment entirely differently. Similar, if we
ever manage to move SLRUs into the buffer pool and checksummed, it'd
again work differently.

Nor is it architecturally appealing to handle checksums in multiple
places above the smgr layer: For one, that requires multiple places to
compute verify them. But also, as the way checksums are computed depends
on the page format etc, it'll likely change for things like undo/slru -
which then again will require additional smarts if done above the smgr
layer.


> Particularly, quickly reading through the patch, I am rather unhappy
> about the shape of the second patch which pushes down the segment
> number knowledge into relpath.c, and creates more complication around
> the handling of zero_damaged_pages and zero'ed pages.  -- Michael

I do not like the SetZeroDamagedPageInChecksum stuff at all however.

Greetings,

Andres Freund




Re: Print physical file path when checksum check fails

2020-02-18 Thread Michael Paquier
On Wed, Feb 19, 2020 at 03:00:54PM +0900, Kyotaro Horiguchi wrote:
> I have had support requests related to broken block several times, and
> (I think) most of *them* had hard time to locate the broken block or
> even broken file.  I don't think it is useles at all, but I'm not sure
> it is worth the additional complexity.

I handle stuff like that from time to time, and any reports usually
go down to people knowledgeable about PostgreSQL enough to know the
difference.  My point is that in order to know where a broken block is
physically located on disk, you need to know four things:
- The block number.
- The physical location of the relation.
- The size of the block.
- The length of a file segment.
The first two items are printed in the error message, and you can
guess easily the actual location (file, offset) with the two others.

I am not necessarily against improving the error message here, but
FWIW I think that we need to consider seriously if the code
complications and the maintenance cost involved are really worth
saving from one simple calculation.  Particularly, quickly reading
through the patch, I am rather unhappy about the shape of the second
patch which pushes down the segment number knowledge into relpath.c,
and creates more complication around the handling of
zero_damaged_pages and zero'ed pages.
--
Michael


signature.asc
Description: PGP signature


Re: Print physical file path when checksum check fails

2020-02-18 Thread Kyotaro Horiguchi
At Wed, 19 Feb 2020 13:28:04 +0900, Michael Paquier  wrote 
in 
> On Wed, Feb 19, 2020 at 01:07:36PM +0900, Kyotaro Horiguchi wrote:
> > If we also verify checksum in md layer, callback is overkill since the
> > immediate caller consumes the event immediately.  We can signal the
> > error by somehow returning a file tag.
> 
> FWIW, I am wondering if there is any need for a change here and
> complicate more the code.  If you know the block number, the page size
> and the segment file size you can immediately guess where is the
> damaged block.  The first information is already part of the error

I have had support requests related to broken block several times, and
(I think) most of *them* had hard time to locate the broken block or
even broken file.  I don't think it is useles at all, but I'm not sure
it is worth the additional complexity.

> damaged block.  The first information is already part of the error
> message, and the two other ones are constants defined at
> compile-time.

May you have misread the snippet?

What Hubert proposed is:

 "invalid page in block %u of relation file %s; zeroing out page",
blkno, 

The second format in my messages just before is:
  "invalid page in block %u in relation %u, file \"%s\"",
 blockNum, smgr->smgr_rnode.node.relNode, smgrfname()

All of them are not compile-time constant at all.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Print physical file path when checksum check fails

2020-02-18 Thread Michael Paquier
On Wed, Feb 19, 2020 at 01:07:36PM +0900, Kyotaro Horiguchi wrote:
> If we also verify checksum in md layer, callback is overkill since the
> immediate caller consumes the event immediately.  We can signal the
> error by somehow returning a file tag.

FWIW, I am wondering if there is any need for a change here and
complicate more the code.  If you know the block number, the page size
and the segment file size you can immediately guess where is the
damaged block.  The first information is already part of the error
message, and the two other ones are constants defined at
compile-time.
--
Michael


signature.asc
Description: PGP signature


Re: Print physical file path when checksum check fails

2020-02-18 Thread Kyotaro Horiguchi
Hello. Thank you for the new patch.

At Tue, 18 Feb 2020 09:27:39 +0800, Hubert Zhang  wrote in 
> On Wed, Feb 12, 2020 at 5:22 PM Hubert Zhang  wrote:
> 
> > Thanks Andres,
> >
> > On Tue, Feb 11, 2020 at 5:30 AM Andres Freund  wrote:
> >
> >> HHi,
> >>
> >> On 2020-02-10 16:04:21 +0800, Hubert Zhang wrote:
> >> > Currently we only print block number and relation path when checksum
> >> check
> >> > fails. See example below:
> >> >
> >> > ERROR: invalid page in block 333571 of relation base/65959/656195
> >>
> >> > DBA complains that she needs additional work to calculate which physical
> >> > file is broken, since one physical file can only contain `RELSEG_SIZE`
> >> > number of blocks. For large tables, we need to use many physical files
> >> with
> >> > additional suffix, e.g. 656195.1, 656195.2 ...
> >> >
> >> > Is that a good idea to also print the physical file path in error
> >> message?
> >> > Like below:
> >> >
> >> > ERROR: invalid page in block 333571 of relation base/65959/656195, file
> >> > path base/65959/656195.2
> >>
> >> I think that'd be a nice improvement. But:
> >>
> >> I don't think the way you did it is right architecturally. The
> >> segmenting is really something that lives within md.c, and we shouldn't
> >> further expose it outside of that. And e.g. the undo patchset uses files
> >> with different segmentation - but still goes through bufmgr.c.
> >>
> >> I wonder if this partially signals that the checksum verification piece
> >> is architecturally done in the wrong place currently? It's imo not good
> >> that every place doing an smgrread() needs to separately verify
> >> checksums. OTOH, it doesn't really belong inside smgr.c.
> >>
> >>
> >> This layering issue was also encountered in
> >>
> >> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0
> >> so perhaps we should work to reuse the FileTag it introduces to
> >> represent segments, without hardcoding the specific segment size?
> >>
> >>
> > I checked the FileTag commit. It calls `register_xxx_segment` inside md.c
> > to store the sync request into a hashtable and used by checkpointer later.
> >
> > Checksum verify is simpler. We could move the `PageIsVerified` into md.c
> > (mdread). But we can not elog error inside md.c because read buffer mode
> > RBM_ZERO_ON_ERROR is at bugmgr.c level.
> >
> > One idea is to change save the error message(or the FileTag) at (e.g. a
> > static string in bufmgr.c) by calling `register_checksum_failure` inside
> > mdread in md.c.
> >
> > As for your concern about the need to do checksum verify after every
> > smgrread, we now move the checksum verify logic into md.c, but we still
> > need to check the checksum verify result after smgrread and reset buffer to
> > zero if mode is RBM_ZERO_ON_ERROR.
> >
> > If this idea is OK, I will submit the new PR.
> >
> >
> Based on Andres's comments, here is the new patch for moving checksum
> verify logic into mdread() instead of call PageIsVerified in every
> smgrread(). Also using FileTag to print the physical file name when
> checksum verify failed, which handle segmenting inside md.c as well.

The patch doesn't address the first comment from Andres. It still
expose the notion of segment to the upper layer, but bufmgr (bufmgr.c
and bufpage.c) or Realation (relpath.c) layer shouldn't know of
segment.  So the two layers should ask smgr without using segment
number for the real file name for the block.

For example, I think the following structure works. (Without moving
checksum verification.)

==
md.c:
  char *mdfname(SmgrRelation reln, Forknumber forkNum, BlockNumber blocknum);
smgr.c:
  char *smgrfname(SMgrRelation reln, ForkNumber forkNum, BlockNumber Blocknum);

bufmgr.c:
  ReadBuffer_common()
  {
  ..
/* check for garbage data */
if (!PageIsVerified((Page) bufBlock, blockNum))
if (mode == RBM_ZERO_ON_ERROR || 
zero_damaged_pages)
ereport(WARNING,
 errmsg("invalid page 
in block %u in file %s; zeroing out page",

blockNum,

smgrfname(smgr, forkNum, blockNum;
MemSet((char *) bufBlock, 0, BLCKSZ);


However, the block number in the error messages looks odd as it is
actually not the block number in the segment. We could convert
BlockNum into relative block number or offset in the file but it would
be overkill. So something like this works?

"invalid page in block %u in relation %u, file \"%s\"",
   blockNum, smgr->smgr_rnode.node.relNode, smgrfname()


If we also verify checksum in md layer, callback is overkill since the
immediate caller consumes the event immediately.  We can signal the
error by somehow returning a file tag.


Re: Print physical file path when checksum check fails

2020-02-17 Thread Hubert Zhang
On Wed, Feb 12, 2020 at 5:22 PM Hubert Zhang  wrote:

> Thanks Andres,
>
> On Tue, Feb 11, 2020 at 5:30 AM Andres Freund  wrote:
>
>> HHi,
>>
>> On 2020-02-10 16:04:21 +0800, Hubert Zhang wrote:
>> > Currently we only print block number and relation path when checksum
>> check
>> > fails. See example below:
>> >
>> > ERROR: invalid page in block 333571 of relation base/65959/656195
>>
>> > DBA complains that she needs additional work to calculate which physical
>> > file is broken, since one physical file can only contain `RELSEG_SIZE`
>> > number of blocks. For large tables, we need to use many physical files
>> with
>> > additional suffix, e.g. 656195.1, 656195.2 ...
>> >
>> > Is that a good idea to also print the physical file path in error
>> message?
>> > Like below:
>> >
>> > ERROR: invalid page in block 333571 of relation base/65959/656195, file
>> > path base/65959/656195.2
>>
>> I think that'd be a nice improvement. But:
>>
>> I don't think the way you did it is right architecturally. The
>> segmenting is really something that lives within md.c, and we shouldn't
>> further expose it outside of that. And e.g. the undo patchset uses files
>> with different segmentation - but still goes through bufmgr.c.
>>
>> I wonder if this partially signals that the checksum verification piece
>> is architecturally done in the wrong place currently? It's imo not good
>> that every place doing an smgrread() needs to separately verify
>> checksums. OTOH, it doesn't really belong inside smgr.c.
>>
>>
>> This layering issue was also encountered in
>>
>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0
>> so perhaps we should work to reuse the FileTag it introduces to
>> represent segments, without hardcoding the specific segment size?
>>
>>
> I checked the FileTag commit. It calls `register_xxx_segment` inside md.c
> to store the sync request into a hashtable and used by checkpointer later.
>
> Checksum verify is simpler. We could move the `PageIsVerified` into md.c
> (mdread). But we can not elog error inside md.c because read buffer mode
> RBM_ZERO_ON_ERROR is at bugmgr.c level.
>
> One idea is to change save the error message(or the FileTag) at (e.g. a
> static string in bufmgr.c) by calling `register_checksum_failure` inside
> mdread in md.c.
>
> As for your concern about the need to do checksum verify after every
> smgrread, we now move the checksum verify logic into md.c, but we still
> need to check the checksum verify result after smgrread and reset buffer to
> zero if mode is RBM_ZERO_ON_ERROR.
>
> If this idea is OK, I will submit the new PR.
>
>
Based on Andres's comments, here is the new patch for moving checksum
verify logic into mdread() instead of call PageIsVerified in every
smgrread(). Also using FileTag to print the physical file name when
checksum verify failed, which handle segmenting inside md.c as well.

-- 
Thanks

Hubert Zhang


0002-Print-physical-file-path-when-verify-checksum-failed.patch
Description: Binary data


Re: Print physical file path when checksum check fails

2020-02-12 Thread Hubert Zhang
Thanks Andres,

On Tue, Feb 11, 2020 at 5:30 AM Andres Freund  wrote:

> HHi,
>
> On 2020-02-10 16:04:21 +0800, Hubert Zhang wrote:
> > Currently we only print block number and relation path when checksum
> check
> > fails. See example below:
> >
> > ERROR: invalid page in block 333571 of relation base/65959/656195
>
> > DBA complains that she needs additional work to calculate which physical
> > file is broken, since one physical file can only contain `RELSEG_SIZE`
> > number of blocks. For large tables, we need to use many physical files
> with
> > additional suffix, e.g. 656195.1, 656195.2 ...
> >
> > Is that a good idea to also print the physical file path in error
> message?
> > Like below:
> >
> > ERROR: invalid page in block 333571 of relation base/65959/656195, file
> > path base/65959/656195.2
>
> I think that'd be a nice improvement. But:
>
> I don't think the way you did it is right architecturally. The
> segmenting is really something that lives within md.c, and we shouldn't
> further expose it outside of that. And e.g. the undo patchset uses files
> with different segmentation - but still goes through bufmgr.c.
>
> I wonder if this partially signals that the checksum verification piece
> is architecturally done in the wrong place currently? It's imo not good
> that every place doing an smgrread() needs to separately verify
> checksums. OTOH, it doesn't really belong inside smgr.c.
>
>
> This layering issue was also encountered in
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0
> so perhaps we should work to reuse the FileTag it introduces to
> represent segments, without hardcoding the specific segment size?
>
>
I checked the FileTag commit. It calls `register_xxx_segment` inside md.c
to store the sync request into a hashtable and used by checkpointer later.

Checksum verify is simpler. We could move the `PageIsVerified` into md.c
(mdread). But we can not elog error inside md.c because read buffer mode
RBM_ZERO_ON_ERROR is at bugmgr.c level.

One idea is to change save the error message(or the FileTag) at (e.g. a
static string in bufmgr.c) by calling `register_checksum_failure` inside
mdread in md.c.

As for your concern about the need to do checksum verify after every
smgrread, we now move the checksum verify logic into md.c, but we still
need to check the checksum verify result after smgrread and reset buffer to
zero if mode is RBM_ZERO_ON_ERROR.

If this idea is OK, I will submit the new PR.

Thanks

Hubert Zhang


Re: Print physical file path when checksum check fails

2020-02-10 Thread Andres Freund
HHi,

On 2020-02-10 16:04:21 +0800, Hubert Zhang wrote:
> Currently we only print block number and relation path when checksum check
> fails. See example below:
> 
> ERROR: invalid page in block 333571 of relation base/65959/656195

> DBA complains that she needs additional work to calculate which physical
> file is broken, since one physical file can only contain `RELSEG_SIZE`
> number of blocks. For large tables, we need to use many physical files with
> additional suffix, e.g. 656195.1, 656195.2 ...
> 
> Is that a good idea to also print the physical file path in error message?
> Like below:
> 
> ERROR: invalid page in block 333571 of relation base/65959/656195, file
> path base/65959/656195.2

I think that'd be a nice improvement. But:

I don't think the way you did it is right architecturally. The
segmenting is really something that lives within md.c, and we shouldn't
further expose it outside of that. And e.g. the undo patchset uses files
with different segmentation - but still goes through bufmgr.c.

I wonder if this partially signals that the checksum verification piece
is architecturally done in the wrong place currently? It's imo not good
that every place doing an smgrread() needs to separately verify
checksums. OTOH, it doesn't really belong inside smgr.c.


This layering issue was also encountered in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0
so perhaps we should work to reuse the FileTag it introduces to
represent segments, without hardcoding the specific segment size?

Regards,

Andres

> @@ -912,17 +912,20 @@ ReadBuffer_common(SMgrRelation smgr, char 
> relpersistence, ForkNumber forkNum,
>   {
>   ereport(WARNING,
>   
> (errcode(ERRCODE_DATA_CORRUPTED),
> -  errmsg("invalid page 
> in block %u of relation %s; zeroing out page",
> +  errmsg("invalid page 
> in block %u of relation %s, "
> + "file 
> path %s; zeroing out page",
>   
> blockNum,
> - 
> relpath(smgr->smgr_rnode, forkNum;
> + 
> relpath(smgr->smgr_rnode, forkNum),
> + 
> relfilepath(smgr->smgr_rnode, forkNum, blockNum;
>   MemSet((char *) bufBlock, 0, BLCKSZ);
>   }
>   else
>   ereport(ERROR,
>   
> (errcode(ERRCODE_DATA_CORRUPTED),
> -  errmsg("invalid page 
> in block %u of relation %s",
> +  errmsg("invalid page 
> in block %u of relation %s, file path %s",
>   
> blockNum,
> - 
> relpath(smgr->smgr_rnode, forkNum;
> + 
> relpath(smgr->smgr_rnode, forkNum),
> + 
> relfilepath(smgr->smgr_rnode, forkNum, blockNum;
>   }
>   }
>   }
> diff --git a/src/common/relpath.c b/src/common/relpath.c
> index ad733d1363..8b39c4ac4f 100644
> --- a/src/common/relpath.c
> +++ b/src/common/relpath.c
> @@ -208,3 +208,30 @@ GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
>   }
>   return path;
>  }
> +
> +/*
> + * GetRelationFilePath - construct path to a relation's physical file
> + * given its block number.
> + */
> + char *
> +GetRelationFilePath(Oid dbNode, Oid spcNode, Oid relNode,
> + int backendId, ForkNumber forkNumber, 
> BlockNumber blkno)
> +{
> + char   *path;
> + char   *fullpath;
> + BlockNumber segno;
> +
> + path = GetRelationPath(dbNode, spcNode, relNode, backendId, forkNumber);
> +
> + segno = blkno / ((BlockNumber) RELSEG_SIZE);
> +
> + if (segno > 0)
> + {
> + fullpath = psprintf("%s.%u", path, segno);
> + pfree(path);
> + }
> + else
> + fullpath = path;
> +
> + return fullpath;
> +}

Greetings,

Andres Freund




Print physical file path when checksum check fails

2020-02-10 Thread Hubert Zhang
Hi hacker,

Currently we only print block number and relation path when checksum check
fails. See example below:

ERROR: invalid page in block 333571 of relation base/65959/656195

DBA complains that she needs additional work to calculate which physical
file is broken, since one physical file can only contain `RELSEG_SIZE`
number of blocks. For large tables, we need to use many physical files with
additional suffix, e.g. 656195.1, 656195.2 ...

Is that a good idea to also print the physical file path in error message?
Like below:

ERROR: invalid page in block 333571 of relation base/65959/656195, file
path base/65959/656195.2

Patch is attached.
-- 
Thanks

Hubert Zhang


0001-Print-physical-file-path-when-checksum-check-fails.patch
Description: Binary data