Hi, Siva!

On Tue, Sep 4, 2018 at 11:01 PM R, Siva <sivas...@amazon.com> wrote:
> We recently encountered an issue where the opaque data flags on a gin data
> leaf page were corrupted while replaying a gin insert WAL record. Upon further
> examination of the redo code, we found a bug in the ginRedoRecompress code,
> which extracts the WAL information and updates the page.
>
> Specifically, when a new segment is inserted in the middle of a page, a 
> memmove operation is performed [1] at the current point in the page to make 
> room for the new segment. If this segment insertion is followed by delete 
> segment actions that are yet to be processed and the total data size is very 
> close to GinDataPageMaxDataSize, then we may move the data portion beyond the 
> boundary causing the opaque data to be corrupted.
>
> One way of solving this problem is to perform the replay work on a scratch
> space and perform a sanity check on the total size of the data portion before
> copying it back to the actual page. While it involves additional memory
> allocation and memcpy operations, it is safer and similar to the 'do' code
> path, where we make sure to copy all segments past the first modified
> segment before placing them back on the page [2].
>
> I have attached a patch for that approach here. Please let us know any 
> comments or feedback.
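
For reference, the scratch-space idea described above can be sketched
roughly as follows. This is only a toy illustration in C, not the attached
patch and not the real ginRedoRecompress code: ToyPage, TOY_MAX_DATA_SIZE,
TOY_OPAQUE_SIZE and toy_replace_range are made-up names, and the page is
reduced to a byte array followed by a few opaque bytes. The point is that
the insert and the following delete are both applied while copying into a
scratch buffer, so the data never grows in place past the boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy layout: a fixed data area followed by opaque bytes. */
#define TOY_MAX_DATA_SIZE 16
#define TOY_OPAQUE_SIZE   4

typedef struct ToyPage
{
    size_t        data_len;                  /* bytes of data in use */
    unsigned char data[TOY_MAX_DATA_SIZE];
    unsigned char opaque[TOY_OPAQUE_SIZE];   /* must never be overwritten */
} ToyPage;

/*
 * Replace del_len bytes at offset off with ins_len bytes from ins.
 *
 * The new contents are assembled in a scratch buffer, the total size is
 * checked, and only then is the result copied back to the page.  Returns
 * false and leaves the page untouched if the arguments are inconsistent
 * or the result would not fit in the data area.
 */
bool
toy_replace_range(ToyPage *page, size_t off, size_t del_len,
                  const unsigned char *ins, size_t ins_len)
{
    unsigned char scratch[TOY_MAX_DATA_SIZE];
    size_t        tail_len;
    size_t        new_len;

    /* The deleted range must lie inside the existing data. */
    if (off > page->data_len || del_len > page->data_len - off)
        return false;

    tail_len = page->data_len - off - del_len;

    /* Sanity check on the final size, done before touching the page. */
    if (ins_len > TOY_MAX_DATA_SIZE ||
        off + ins_len + tail_len > TOY_MAX_DATA_SIZE)
        return false;
    new_len = off + ins_len + tail_len;

    /* Build the new contents: unchanged head, new bytes, unchanged tail. */
    memcpy(scratch, page->data, off);
    if (ins_len > 0)
        memcpy(scratch + off, ins, ins_len);
    memcpy(scratch + off + ins_len, page->data + off + del_len, tail_len);

    /* Copy back only after the size is known to be safe. */
    memcpy(page->data, scratch, new_len);
    page->data_len = new_len;
    return true;
}
```

With a 14-byte data area in use, replacing 4 bytes with 4 new bytes
succeeds, while inserting 4 bytes with nothing deleted is rejected and
the opaque bytes stay intact.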

Do you have a test scenario for reproducing this issue?  We need
it to ensure that the fix is correct.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
