Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-17 Thread Jeremy Schneider
On 2/17/20 08:23, Jeremy Schneider wrote:
> FWIW, Bertrand blogged an even faster way to do this about a month ago -
> using pageinspect and processing blocks instead of rows
> 
> https://bdrouvot.wordpress.com/2020/01/18/retrieve-postgresql-variable-length-storage-information-thanks-to-pageinspect/
> 
> 
>> On Feb 17, 2020, at 03:32, Nick Renders  wrote:
>>
>> The problem is that I don't know which column is corrupt. But I found
>> a solution: by simply copying the record into another variable, the
>> values are parsed and the TOAST errors are thrown.
>>
>> In case anyone's interested, here's my code, based on an example from
>> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html

Apologies - sent that slightly rushed email while I was taking the bus
to the office this morning and linked the wrong blog post :D

Here's the right one, showing a much faster way to identify which tuple
links to a bad toast row:

https://bdrouvot.wordpress.com/2020/01/04/get-toast-chunk_id-from-the-user-table-tuples-or-from-the-toast-index-thanks-to-pageinspect/

-Jeremy

-- 
http://about.me/jeremy_schneider




Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-17 Thread Jeremy Schneider
FWIW, Bertrand blogged an even faster way to do this about a month ago - using 
pageinspect and processing blocks instead of rows

https://bdrouvot.wordpress.com/2020/01/18/retrieve-postgresql-variable-length-storage-information-thanks-to-pageinspect/

-J

Sent from my TI-83

> On Feb 17, 2020, at 03:32, Nick Renders  wrote:
> 
> The problem is that I don't know which column is corrupt. But I found a 
> solution: by simply copying the record into another variable, the values are 
> parsed and the TOAST errors are thrown.
> 
> In case anyone's interested, here's my code, based on an example from 
> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html
> 
> 
> DO $f$
> DECLARE
>rContent1 record;
>rContent2 record;
>iCounter integer DEFAULT 1;
>iValue integer;
>pTableName varchar := 'f_gsxws_transaction';
>pFieldName varchar := 'gwta_number';
> BEGIN
>FOR iValue IN EXECUTE 'SELECT ' || pFieldName || ' FROM ' || 
> pTableName::regclass || ' ORDER BY ' || pFieldName LOOP
>BEGIN
>EXECUTE 'SELECT * FROM ' || pTableName::regclass || ' WHERE ' || 
> pFieldName || ' = $1'
>INTO rContent1
>USING iValue;
>rContent2 := rContent1;
>EXCEPTION WHEN OTHERS THEN
>RAISE NOTICE 'data for %.% % is corrupt', pTableName, pFieldName, 
> iValue;
>END;
>IF iCounter % 10 = 0 THEN
>RAISE NOTICE '% % records checked', iCounter, pTableName;
>END IF;
>iCounter := iCounter+1;
>END LOOP;
> END;
> $f$;
> 
> 
> Cheers,
> 
> Nick
> 
> 
>> On 14 Feb 2020, at 16:14, Tom Lane wrote:
>> 
>> "Nick Renders"  writes:
>>> I thought I would write a little PL script that would loop through all
>>> the data and report any inconsistencies. However, I can't get it to work
>>> properly.
>>> ...
>>> 1) The function has no problem executing the SELECT statement. It is
>>> only when "rcontents" is returned, that the function fails. This is a
>>> problem, because the ultimate goal is to loop through all records and
>>> only return/alert something in case of an error.
>>> 2) The function never enters the EXCEPTION clause. Instead, when it hits
>>> the RETURN command, it breaks and shows the same message as in pgAdmin:
>>> missing chunk number 0 for toast value 8289525 in pg_toast_5572299.
>> 
>> I think what's happening there is that the function doesn't try to
>> dereference the value's TOAST pointer during SELECT INTO.  It just stores
>> that pointer into a variable, and only sometime later when the actual
>> content of the value is demanded, do you see the error raised.
>> 
>> The solution to that is to do something that uses the contents of the
>> busted column right away while still inside the EXCEPTION block, perhaps
>> along the lines of "select md5(mycolumn) into local_variable from..."
>> 
>> A close reading of
>> 
>> https://www.postgresql.org/docs/current/storage-toast.html
>> 
>> would probably help you understand what's happening here.
>> 
>>regards, tom lane
> 
> 


Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-17 Thread Nick Renders

Hi Jeremy,

This happend on PostgreSQL v9.6 which crashed 2 weeks ago.
Since then we have upgraded and restored our server, but my example is 
from the older, corrupt database.


Nick


On 15 Feb 2020, at 5:30, Jeremy Schneider wrote:


On Feb 14, 2020, at 04:39, Nick Renders  wrote:

I get the following message:

  ERROR:  missing chunk number 0 for toast value 8289525 in 
pg_toast_5572299


What version of PostgreSQL are you running? I’ve seen this a number 
of times the past couple years; curious if the lurking bug is still 
observed in latest versions.


-Jeremy

Sent from my TI-83





Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-17 Thread Nick Renders
The problem is that I don't know which column is corrupt. But I found a 
solution: by simply copying the record into another variable, the values 
are parsed and the TOAST errors are thrown.


In case anyone's interested, here's my code, based on an example from 
http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html



DO $f$
DECLARE
rContent1 record;
rContent2 record;
iCounter integer DEFAULT 1;
iValue integer;
pTableName varchar := 'f_gsxws_transaction';
pFieldName varchar := 'gwta_number';
BEGIN
	FOR iValue IN EXECUTE 'SELECT ' || pFieldName || ' FROM ' || 
pTableName::regclass || ' ORDER BY ' || pFieldName LOOP

BEGIN
			EXECUTE 'SELECT * FROM ' || pTableName::regclass || ' WHERE ' || 
pFieldName || ' = $1'

INTO rContent1
USING iValue;
rContent2 := rContent1;
EXCEPTION WHEN OTHERS THEN
			RAISE NOTICE 'data for %.% % is corrupt', pTableName, pFieldName, 
iValue;

END;
IF iCounter % 10 = 0 THEN
RAISE NOTICE '% % records checked', iCounter, 
pTableName;
END IF;
iCounter := iCounter+1;
END LOOP;
END;
$f$;


Cheers,

Nick


On 14 Feb 2020, at 16:14, Tom Lane wrote:


"Nick Renders"  writes:
I thought I would write a little PL script that would loop through 
all
the data and report any inconsistencies. However, I can't get it to 
work

properly.
...
1) The function has no problem executing the SELECT statement. It is
only when "rcontents" is returned, that the function fails. This is a
problem, because the ultimate goal is to loop through all records and
only return/alert something in case of an error.
2) The function never enters the EXCEPTION clause. Instead, when it 
hits
the RETURN command, it breaks and shows the same message as in 
pgAdmin:

missing chunk number 0 for toast value 8289525 in pg_toast_5572299.


I think what's happening there is that the function doesn't try to
dereference the value's TOAST pointer during SELECT INTO.  It just 
stores

that pointer into a variable, and only sometime later when the actual
content of the value is demanded, do you see the error raised.

The solution to that is to do something that uses the contents of the
busted column right away while still inside the EXCEPTION block, 
perhaps

along the lines of "select md5(mycolumn) into local_variable from..."

A close reading of

https://www.postgresql.org/docs/current/storage-toast.html

would probably help you understand what's happening here.

regards, tom lane





Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-14 Thread Jeremy Schneider


> On Feb 14, 2020, at 04:39, Nick Renders  wrote:
> 
> I get the following message:
> 
>   ERROR:  missing chunk number 0 for toast value 8289525 in pg_toast_5572299

What version of PostgreSQL are you running? I’ve seen this a number of times 
the past couple years; curious if the lurking bug is still observed in latest 
versions.

-Jeremy

Sent from my TI-83




Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-14 Thread Tom Lane
"Nick Renders"  writes:
> I thought I would write a little PL script that would loop through all 
> the data and report any inconsistencies. However, I can't get it to work 
> properly.
> ...
> 1) The function has no problem executing the SELECT statement. It is 
> only when "rcontents" is returned, that the function fails. This is a 
> problem, because the ultimate goal is to loop through all records and 
> only return/alert something in case of an error.
> 2) The function never enters the EXCEPTION clause. Instead, when it hits 
> the RETURN command, it breaks and shows the same message as in pgAdmin: 
> missing chunk number 0 for toast value 8289525 in pg_toast_5572299.

I think what's happening there is that the function doesn't try to
dereference the value's TOAST pointer during SELECT INTO.  It just stores
that pointer into a variable, and only sometime later when the actual
content of the value is demanded, do you see the error raised.

The solution to that is to do something that uses the contents of the
busted column right away while still inside the EXCEPTION block, perhaps
along the lines of "select md5(mycolumn) into local_variable from..."

A close reading of

https://www.postgresql.org/docs/current/storage-toast.html

would probably help you understand what's happening here.

regards, tom lane




Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-14 Thread Achilleas Mantzios

On 14/2/20 2:39 μ.μ., Nick Renders wrote:


Hello,

We recently suffered a database crash which resulted in some corrupt records.

I thought I would write a little PL script that would loop through all the data 
and report any inconsistencies. However, I can't get it to work properly.

For instance, if I run the following statement in pgAdmin:

SELECT * FROM f_gsxws_transaction WHERE gwta_number = 762513

I get the following message:

ERROR:  missing chunk number 0 for toast value 8289525 in pg_toast_5572299


So, as a test, I created a function that would just retrieve that one record:

DECLARE
    rcontent f_gsxws_transaction%ROWTYPE;
BEGIN
    SELECT * INTO rcontent FROM f_gsxws_transaction where gwta_number = 
762513;
    RETURN rcontent;
EXCEPTION WHEN OTHERS THEN
    RAISE NOTICE 'Record 762513 is corrupt';
END;


Now, when I run this function, I have noticed two things:

1) The function has no problem executing the SELECT statement. It is only when "rcontents" is returned, that the function fails. This is a problem, because the ultimate goal is to loop through all 
records and only return/alert something in case of an error.


2) The function never enters the EXCEPTION clause. Instead, when it hits the RETURN command, it breaks and shows the same message as in pgAdmin: missing chunk number 0 for toast value 8289525 in 
pg_toast_5572299.

Does the table have any PKs or UKs?
do something like

FOR vid IN SELECT  FROM f_gsxws_transaction where gwta_number = 762513 
ORDER BY  LOOP
    RAISE NOTICE 'examining row with = %',vid;
    select * into rcontent FROM f_gsxws_transaction where  = vid;
    RAISE NOTICE 'content of row = % , is % ',vid,rcontent;
END LOOP;




Is it possible to check for these kind of errors with a PL script? Or is there 
perhaps a better way to check for corrupt records in a database?

Best regards,

Nick Renders





--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt