Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-09-04 Thread Aleksander Alekseev
Hi,

> Aleksander, thank you for reminding me of this patch, try to do it in a few 
> days.

A consensus was reached [1] to mark this patch as RwF for now. There
are many patches to be reviewed and this one doesn't seem to be in the
best shape, so we have to prioritise. Please feel free re-submitting
the patch for the next commitfest.

[1]: https://postgr.es/m/0737f444-59bb-ac1d-2753-873c40da0840%40eisentraut.org

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-07-11 Thread Nikita Malakhov
Hi!

Aleksander, thank you for reminding me of this patch, try to do it in a few
days.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-07-10 Thread Aleksander Alekseev
Hi Nikita,

> I've prepared a patch with a 64-bit TOAST Value ID. It is a kind of prototype
> and needs some further work, but it is already working and ready to play with.

Unfortunately the patch rotted a bit. Could you please submit an
updated/rebased patch so that it could be reviewed in the scope of
July commitfest?

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-04-27 Thread Nikita Malakhov
Hi!

Widening of a TOAST pointer is possible if we keep backward compatibility
with
old-fashioned TOAST tables - I mean differ 'long' and 'short' TOAST
pointers and
process them accordingly on insert and delete cases, and vacuum with logical
replication. It is not very difficult, however it takes some effort.
Recently I've found
out that I have not overseen all compatibility cases, so the provided patch
is
functional but limited in compatibility.

We already have a flag byte in the TOAST pointer which is responsible for
the type
of the pointer - va_flag field. It was explained in the Pluggable TOAST
patch.
One is enough, there is no need to add another one.


On Wed, Apr 26, 2023 at 3:55 PM Aleksander Alekseev <
aleksan...@timescale.com> wrote:

> Hi,
>
> > I agree that we can't simply widen varatt_external to use 8 bytes for
> > the toast ID in all cases.
>
> +1
>
> Note that the user may have a table with multiple TOASTable
> attributes. If we simply widen the TOAST pointer it may break the
> existing tables in the edge case. Also this may be a reason why
> certain users may prefer to continue using narrow pointers. IMO wide
> TOAST pointers should be a table option. Whether the default for new
> tables should be wide or narrow pointers is debatable.
>
> In another discussion [1] we seem to agree that we also want to have
> an ability to include a 32-bit dictionary_id to the TOAST pointers and
> perhaps support more compression methods (ZSTD to name one). Besides
> that it would be nice to have an ability to extend TOAST pointers in
> the future without breaking the existing pointers. One possible
> solution would be to add a varint feature bitmask to every pointer. So
> we could add flags like TOAST_IS_WIDE, TOAST_HAS_DICTIONARY,
> TOAST_UNKNOWN_FEATURE_FROM_2077, etc indefinitely.
>
> I suggest we address all the current and future needs once and
> completely refactor TOAST pointers rather than solving one problem at
> a time. I believe this will be more beneficial for the community in
> the long term.
>
> Thoughts?
>
> [1]: https://commitfest.postgresql.org/43/3626/
>
> --
> Best regards,
> Aleksander Alekseev
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-04-26 Thread Aleksander Alekseev
Hi,

> I agree that we can't simply widen varatt_external to use 8 bytes for
> the toast ID in all cases.

+1

Note that the user may have a table with multiple TOASTable
attributes. If we simply widen the TOAST pointer it may break the
existing tables in the edge case. Also this may be a reason why
certain users may prefer to continue using narrow pointers. IMO wide
TOAST pointers should be a table option. Whether the default for new
tables should be wide or narrow pointers is debatable.

In another discussion [1] we seem to agree that we also want to have
an ability to include a 32-bit dictionary_id to the TOAST pointers and
perhaps support more compression methods (ZSTD to name one). Besides
that it would be nice to have an ability to extend TOAST pointers in
the future without breaking the existing pointers. One possible
solution would be to add a varint feature bitmask to every pointer. So
we could add flags like TOAST_IS_WIDE, TOAST_HAS_DICTIONARY,
TOAST_UNKNOWN_FEATURE_FROM_2077, etc indefinitely.

I suggest we address all the current and future needs once and
completely refactor TOAST pointers rather than solving one problem at
a time. I believe this will be more beneficial for the community in
the long term.

Thoughts?

[1]: https://commitfest.postgresql.org/43/3626/

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-04-24 Thread Nikita Malakhov
Hi!

No, it wasn't. It was a proposal, I thought I'd get some feedback on it
before sending it to commitfest.

On Sat, Apr 22, 2023 at 6:17 PM Gurjeet Singh  wrote:

> On Thu, Dec 22, 2022 at 10:07 AM Nikita Malakhov 
> wrote:
> > Any suggestions on the previous message (64-bit toast value ID with
> individual counters)?
>
> Was this patch ever added to CommitFest? I don't see it in the current
> Open Commitfest.
>
> https://commitfest.postgresql.org/43/
>
> Best regards,
> Gurjeet http://Gurje.et
> http://aws.amazon.com
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2023-04-22 Thread Gurjeet Singh
On Thu, Dec 22, 2022 at 10:07 AM Nikita Malakhov  wrote:
> Any suggestions on the previous message (64-bit toast value ID with 
> individual counters)?

Was this patch ever added to CommitFest? I don't see it in the current
Open Commitfest.

https://commitfest.postgresql.org/43/

Best regards,
Gurjeet http://Gurje.et
http://aws.amazon.com




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-12-22 Thread Nikita Malakhov
Hi hackers!

Any suggestions on the previous message (64-bit toast value ID with
individual counters)?

On Tue, Dec 13, 2022 at 1:41 PM Nikita Malakhov  wrote:

> Hi hackers!
>
> I've prepared a patch with a 64-bit TOAST Value ID. It is a kind of
> prototype
> and needs some further work, but it is already working and ready to play
> with.
>
> I've introduced 64-bit value ID field in varatt_external, but to keep it
> compatible
> with older bases 64-bit value is stored as a structure of two 32-bit
> fields and value
> ID moved to be the last in the varatt_external structure. Also, I've
> introduced
> different ID assignment function - instead of using global OID assignment
> function
> I use individual counters for each TOAST table, automatically cached and
> after
> server restart when new value is inserted into TOAST table maximum used
> value
> is searched and used to assign the next one.
>
> >Andres Freund:
> >I think we'll need to do something about the width of varatt_external to
> make
> >the conversion to 64bit toast oids viable - and if we do, I don't think
> >there's a decent argument for staying with 4 byte toast OIDs. I think the
> >varatt_external equivalent would end up being smaller in just about all
> cases.
> >And as you said earlier, the increased overhead inside the toast table /
> index
> >is not relevant compared to the size of toasted datums.
>
> There is some small overhead due to indexing 64-bit values. Also, for
> smaller
> tables we can use 32-bit ID instead of 64-bit, but then we would have a
> problem
> when we reach the limit of 2^32.
>
> Pg_upgrade seems to be not a very difficult case if we go keeping old TOAST
> tables using 32-bit values,
>
> Please have a look. I'd be grateful for some further directions.
>
> GIT branch with this feature, rebased onto current master:
> https://github.com/postgrespro/postgres/tree/toast_64bit
>
>
> --
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-12-13 Thread Nikita Malakhov
Hi hackers!

I've prepared a patch with a 64-bit TOAST Value ID. It is a kind of
prototype
and needs some further work, but it is already working and ready to play
with.

I've introduced 64-bit value ID field in varatt_external, but to keep it
compatible
with older bases 64-bit value is stored as a structure of two 32-bit fields
and value
ID moved to be the last in the varatt_external structure. Also, I've
introduced
different ID assignment function - instead of using global OID assignment
function
I use individual counters for each TOAST table, automatically cached and
after
server restart when new value is inserted into TOAST table maximum used
value
is searched and used to assign the next one.

>Andres Freund:
>I think we'll need to do something about the width of varatt_external to
make
>the conversion to 64bit toast oids viable - and if we do, I don't think
>there's a decent argument for staying with 4 byte toast OIDs. I think the
>varatt_external equivalent would end up being smaller in just about all
cases.
>And as you said earlier, the increased overhead inside the toast table /
index
>is not relevant compared to the size of toasted datums.

There is some small overhead due to indexing 64-bit values. Also, for
smaller
tables we can use 32-bit ID instead of 64-bit, but then we would have a
problem
when we reach the limit of 2^32.

Pg_upgrade seems to be not a very difficult case if we go keeping old TOAST
tables using 32-bit values,

Please have a look. I'd be grateful for some further directions.

GIT branch with this feature, rebased onto current master:
https://github.com/postgrespro/postgres/tree/toast_64bit

On Wed, Dec 7, 2022 at 12:00 AM Nikita Malakhov  wrote:

> Hi hackers!
>
> Here's some update on the subject - I've made a branch based on master from
> 28/11 and introduced 64-bit TOAST value ID there. It is not complete now
> but
> is already working and has some features:
> - extended structure for TOAST pointer (varatt_long_external) with 64-bit
> TOAST value ID;
> - individual ID counters for each TOAST table, being cached for faster
> access
> and lookup of maximum TOAST ID in use after server restart.
> https://github.com/postgrespro/postgres/tree/toast_64bit
>
> --
> Regards,
> Nikita Malakhov
> Postgres Professional
> https://postgrespro.ru/
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


0001_64_bit_toast_valueid_v1.patch
Description: Binary data


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-12-06 Thread Nikita Malakhov
Hi hackers!

Here's some update on the subject - I've made a branch based on master from
28/11 and introduced 64-bit TOAST value ID there. It is not complete now but
is already working and has some features:
- extended structure for TOAST pointer (varatt_long_external) with 64-bit
TOAST value ID;
- individual ID counters for each TOAST table, being cached for faster
access
and lookup of maximum TOAST ID in use after server restart.
https://github.com/postgrespro/postgres/tree/toast_64bit

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-29 Thread Nikita Malakhov
Hi!

I'm working on this issue according to the plan Tom proposed above -

>I agree that we can't simply widen varatt_external to use 8 bytes for
>the toast ID in all cases.  Also, I now get the point about avoiding
>use of globally assigned OIDs here: if the counter starts from zero
>for each table, then a variable-width varatt_external could actually
>be smaller than currently for many cases.  However, that bit is somewhat
>orthogonal, and it's certainly not required for fixing the basic problem.

Have I understood correctly that you suppose using an individual counter
for each TOAST table? I'm working on this approach, so we store counters
in cache, but I see an issue with the first insert - when there is no
counter
in cache so we have to loop through the table with increasing steps to find
available one (i.e. after restart). Or we still use single global counter,
but
64-bit with a wraparound?

>So it seems like the plan of attack ought to be:

>1. Invent a new form or forms of varatt_external to allow different
>widths of the toast ID.  Use the narrowest width possible for any
>given ID value.

I'm using the VARTAG field - there are plenty of available values, so there
is no problem in distinguishing regular toast pointer with 'short' value id
(4 bytes) from long (8 bytes).

varatt_external currently is 32-bit aligned, so there is no reason in using
narrower type for value ids up to 16 bits.Or is it?

>2. Allow TOAST tables/indexes to store either 4-byte or 8-byte IDs.
>(Conversion could be done as a side effect of table-rewrite
>operations, perhaps.)

Still on toast/detoast part, would get to that later.

>3. Localize ID selection so that tables can have small toast IDs
>even when other tables have many IDs.  (Optional, could do later.)

>
Thank you!

-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Nikita Malakhov
I've missed that -

>4 billion external datums
>typically use a lot of space.

Not quite so. It's 8 Tb for the minimal size of toasted data (about 2 Kb).
In my practice tables with more than 5 billions of rows are not of
something out
of the ordinary (highly loaded databases with large amounts of data in use).

On Tue, Nov 29, 2022 at 1:12 AM Nikita Malakhov  wrote:

> Hi,
>
> I'll check that tomorrow. If it is so then there won't be a problem keeping
> old tables without re-toasting.
>
> On Tue, Nov 29, 2022 at 1:10 AM Andres Freund  wrote:
>
>> Hi,
>>
>> On 2022-11-28 16:57:53 -0500, Tom Lane wrote:
>> > As I said before, I think there's a decent argument that some people
>> > will want the option to stay with 4-byte TOAST OIDs indefinitely,
>> > at least for smaller tables.
>>
>> I think we'll need to do something about the width of varatt_external to
>> make
>> the conversion to 64bit toast oids viable - and if we do, I don't think
>> there's a decent argument for staying with 4 byte toast OIDs. I think the
>> varatt_external equivalent would end up being smaller in just about all
>> cases.
>> And as you said earlier, the increased overhead inside the toast table /
>> index
>> is not relevant compared to the size of toasted datums.
>>
>> Greetings,
>>
>> Andres Freund
>>
>
>
> --
> Regards,
> Nikita Malakhov
> Postgres Professional
> https://postgrespro.ru/
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-28 16:57:53 -0500, Tom Lane wrote:
>> As I said before, I think there's a decent argument that some people
>> will want the option to stay with 4-byte TOAST OIDs indefinitely,
>> at least for smaller tables.

> And as you said earlier, the increased overhead inside the toast table / index
> is not relevant compared to the size of toasted datums.

Perhaps not.

> I think we'll need to do something about the width of varatt_external to make
> the conversion to 64bit toast oids viable - and if we do, I don't think
> there's a decent argument for staying with 4 byte toast OIDs. I think the
> varatt_external equivalent would end up being smaller in just about all cases.

I agree that we can't simply widen varatt_external to use 8 bytes for
the toast ID in all cases.  Also, I now get the point about avoiding
use of globally assigned OIDs here: if the counter starts from zero
for each table, then a variable-width varatt_external could actually
be smaller than currently for many cases.  However, that bit is somewhat
orthogonal, and it's certainly not required for fixing the basic problem.

So it seems like the plan of attack ought to be:

1. Invent a new form or forms of varatt_external to allow different
widths of the toast ID.  Use the narrowest width possible for any
given ID value.

2. Allow TOAST tables/indexes to store either 4-byte or 8-byte IDs.
(Conversion could be done as a side effect of table-rewrite
operations, perhaps.)

3. Localize ID selection so that tables can have small toast IDs
even when other tables have many IDs.  (Optional, could do later.)

regards, tom lane




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Nikita Malakhov
Hi,

I'll check that tomorrow. If it is so then there won't be a problem keeping
old tables without re-toasting.

On Tue, Nov 29, 2022 at 1:10 AM Andres Freund  wrote:

> Hi,
>
> On 2022-11-28 16:57:53 -0500, Tom Lane wrote:
> > As I said before, I think there's a decent argument that some people
> > will want the option to stay with 4-byte TOAST OIDs indefinitely,
> > at least for smaller tables.
>
> I think we'll need to do something about the width of varatt_external to
> make
> the conversion to 64bit toast oids viable - and if we do, I don't think
> there's a decent argument for staying with 4 byte toast OIDs. I think the
> varatt_external equivalent would end up being smaller in just about all
> cases.
> And as you said earlier, the increased overhead inside the toast table /
> index
> is not relevant compared to the size of toasted datums.
>
> Greetings,
>
> Andres Freund
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Andres Freund
Hi,

On 2022-11-28 16:57:53 -0500, Tom Lane wrote:
> As I said before, I think there's a decent argument that some people
> will want the option to stay with 4-byte TOAST OIDs indefinitely,
> at least for smaller tables.

I think we'll need to do something about the width of varatt_external to make
the conversion to 64bit toast oids viable - and if we do, I don't think
there's a decent argument for staying with 4 byte toast OIDs. I think the
varatt_external equivalent would end up being smaller in just about all cases.
And as you said earlier, the increased overhead inside the toast table / index
is not relevant compared to the size of toasted datums.

Greetings,

Andres Freund




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-29 00:24:49 +0300, Nikita Malakhov wrote:
>> 2) Upgrading to 64-bit id would require re-toasting old TOAST tables. Or
>> there is some way to distinguish old tables from new ones?

> The catalog / relcache entry should suffice to differentiate between the two.

Yeah, you could easily look at the datatype of the first attribute
(in either the TOAST table or its index) to determine what to do.

As I said before, I think there's a decent argument that some people
will want the option to stay with 4-byte TOAST OIDs indefinitely,
at least for smaller tables.  So even without the fact that forced
conversions would be horridly expensive, we'll need to continue
support for both forms of TOAST table.

regards, tom lane




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Andres Freund
Hi,

On 2022-11-29 00:24:49 +0300, Nikita Malakhov wrote:
> 2) Upgrading to 64-bit id would require re-toasting old TOAST tables. Or
> there is some way to distinguish old tables from new ones?

The catalog / relcache entry should suffice to differentiate between the two.

Greetings,

Andres Freund




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-28 16:04:12 -0500, Tom Lane wrote:
>> And I don't buy that either.  An extra 4 bytes with a 2K payload is not
>> "prohibitive", it's more like "down in the noise".

> The space usage for the the the toast relation + index itself is indeed
> irrelevant. Where it's not "down in the noise" is in struct varatt_external,
> i.e. references to external toast datums.

Ah, gotcha.  Yes, the size of varatt_external is a problem.

regards, tom lane




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Nikita Malakhov
Hi,

Andres Freund  writes:
>Was the issue that you exceeded 4 billion toasted datums, or that
assignment
>took a long time? How many toast datums did you actually have? Was this
due to
>very wide rows leading to even small datums getting toasted?

Yep, we had 4 billion toasted datums. I remind that currently relation has
single
TOAST table for all toastable attributes, so it is not so difficult to get
to 4 billion
of toasted values.

>I think if we switch to int8 keys and widen the global OID counter to 8
>bytes (using just the low 4 bytes for other purposes), we'll have a
>perfectly fine solution.  There is still plenty of work to be done under
>that plan, because of the need to maintain backward compatibility for
>existing TOAST tables --- and maybe people would want an option to keep on
>using them, for non-enormous tables?  If we add additional work on top of
>that, it'll just mean that it will take longer to have any solution.

I agree, but:
1) Global OID counter is used not only for TOAST, so there would be a lot of
places where the short counter (low part of 64 OID, if we go with that) is
used;
2) Upgrading to 64-bit id would require re-toasting old TOAST tables. Or
there
is some way to distinguish old tables from new ones?

But I don't see any reason to keep an old short ID as an option.

...
>Yeah, that is completely horrid.  It does not remove the existing failure
>mode, just changes it to have worse consequences.

On Tue, Nov 29, 2022 at 12:04 AM Tom Lane  wrote:

> Andres Freund  writes:
> > I think the first step to improve the situation is to not use a global
> oid
> > counter for toasted values. One way to do that would be to use the
> sequence
> > code to do oid assignment, but we likely can find a more efficient
> > representation.
>
> I don't particularly buy that, because the only real fix is this:
>
> > Eventually we should do the obvious thing and make toast ids 64bit wide
>
> and if we do that there'll be no need to worry about multiple counters.
>
> > - to
> > combat the space usage we likely should switch to representing the ids as
> > variable width integers or such, otherwise the space increase would
> likely be
> > prohibitive.
>
> And I don't buy that either.  An extra 4 bytes with a 2K payload is not
> "prohibitive", it's more like "down in the noise".
>
> I think if we switch to int8 keys and widen the global OID counter to 8
> bytes (using just the low 4 bytes for other purposes), we'll have a
> perfectly fine solution.  There is still plenty of work to be done under
> that plan, because of the need to maintain backward compatibility for
> existing TOAST tables --- and maybe people would want an option to keep on
> using them, for non-enormous tables?  If we add additional work on top of
> that, it'll just mean that it will take longer to have any solution.
>
> >> Quick fix for this problem is limiting GetNewOidWithIndex loops to some
> >> reasonable amount defined by related macro and returning error if there
> is
> >> still no available Oid. Please check attached patch, any feedback is
> >> appreciated.
>
> > This feels like the wrong spot to tackle the issue.
>
> Yeah, that is completely horrid.  It does not remove the existing failure
> mode, just changes it to have worse consequences.
>
> regards, tom lane
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Andres Freund
Hi,

On 2022-11-28 16:04:12 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > - to
> > combat the space usage we likely should switch to representing the ids as
> > variable width integers or such, otherwise the space increase would likely 
> > be
> > prohibitive.
> 
> And I don't buy that either.  An extra 4 bytes with a 2K payload is not
> "prohibitive", it's more like "down in the noise".

The space usage for the the the toast relation + index itself is indeed
irrelevant. Where it's not "down in the noise" is in struct varatt_external,
i.e. references to external toast datums. The size of that already is an
issue.

Greetings,

Andres Freund




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Tom Lane
Andres Freund  writes:
> I think the first step to improve the situation is to not use a global oid
> counter for toasted values. One way to do that would be to use the sequence
> code to do oid assignment, but we likely can find a more efficient
> representation.

I don't particularly buy that, because the only real fix is this:

> Eventually we should do the obvious thing and make toast ids 64bit wide

and if we do that there'll be no need to worry about multiple counters.

> - to
> combat the space usage we likely should switch to representing the ids as
> variable width integers or such, otherwise the space increase would likely be
> prohibitive.

And I don't buy that either.  An extra 4 bytes with a 2K payload is not
"prohibitive", it's more like "down in the noise".

I think if we switch to int8 keys and widen the global OID counter to 8
bytes (using just the low 4 bytes for other purposes), we'll have a
perfectly fine solution.  There is still plenty of work to be done under
that plan, because of the need to maintain backward compatibility for
existing TOAST tables --- and maybe people would want an option to keep on
using them, for non-enormous tables?  If we add additional work on top of
that, it'll just mean that it will take longer to have any solution.

>> Quick fix for this problem is limiting GetNewOidWithIndex loops to some
>> reasonable amount defined by related macro and returning error if there is
>> still no available Oid. Please check attached patch, any feedback is
>> appreciated.

> This feels like the wrong spot to tackle the issue.

Yeah, that is completely horrid.  It does not remove the existing failure
mode, just changes it to have worse consequences.

regards, tom lane




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Andres Freund
Hi,

On 2022-11-28 23:54:53 +0300, Nikita Malakhov wrote:
> We've already encountered this issue on large production databases, and
> 4 billion rows is not so much for modern bases, so this issue already arises
> from time to time and would arise more and more often.

Was the issue that you exceeded 4 billion toasted datums, or that assignment
took a long time? How many toast datums did you actually have? Was this due to
very wide rows leading to even small datums getting toasted?

Greetings,

Andres Freund




Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Nikita Malakhov
Hi!

We've already encountered this issue on large production databases, and
4 billion rows is not so much for modern bases, so this issue already arises
from time to time and would arise more and more often. I agree that global
oid counter is the main issue, and better solution would be local counters
with larger datatype for value id. This is the right way to solve this
issue,
although it would take some time. As I understand, global counter was taken
because it looked the fastest way of getting unique ID.
Ok, I'll prepare a patch with it.

>Due to that we end up assigning oids that conflict with existing
>toast oids much sooner than 4 billion toasted datums.

Just a note: global oid is checked for related TOAST table only, so equal
oids
in different TOAST tables would not collide.

>Eventually we should do the obvious thing and make toast ids 64bit wide -
to
>combat the space usage we likely should switch to representing the ids as
>variable width integers or such, otherwise the space increase would likely
be
>prohibitive.

I'm already working on it, but I thought that 64-bit value ID won't be
easily
accepted by community. I'd be very thankful for any advice on this.

On Mon, Nov 28, 2022 at 11:36 PM Andres Freund  wrote:

> Hi,
>
> On 2022-11-28 18:34:20 +0300, Nikita Malakhov wrote:
> > While working on Pluggable TOAST we've detected a defective behavior
> > on tables with large amounts of TOASTed data - queries freeze and DB
> stalls.
> > Further investigation led us to the loop with GetNewOidWithIndex function
> > call - when all available Oid already exist in the related TOAST table
> this
> > loop continues infinitely. Data type used for value ID is the UINT32,
> which
> > is
> > unsigned int and has a maximum value of *4294967295* which allows
> > maximum 4294967295 records in the TOAST table. It is not a very big
> amount
> > for modern databases and is the major problem for productive systems.
>
> I don't think the absolute number is the main issue - by default external
> toasting will happen only for bigger datums. 4 billion external datums
> typically use a lot of space.
>
> If you hit this easily with your patch, then you likely broke the
> conditions
> under which external toasting happens.
>
> IMO the big issue is the global oid counter making it much easier to hit
> oid
> wraparound. Due to that we end up assigning oids that conflict with
> existing
> toast oids much sooner than 4 billion toasted datums.
>
> I think the first step to improve the situation is to not use a global oid
> counter for toasted values. One way to do that would be to use the sequence
> code to do oid assignment, but we likely can find a more efficient
> representation.
>
> Eventually we should do the obvious thing and make toast ids 64bit wide -
> to
> combat the space usage we likely should switch to representing the ids as
> variable width integers or such, otherwise the space increase would likely
> be
> prohibitive.
>
>
> > Quick fix for this problem is limiting GetNewOidWithIndex loops to some
> > reasonable amount defined by related macro and returning error if there
> is
> > still no available Oid. Please check attached patch, any feedback is
> > appreciated.
>
> This feels like the wrong spot to tackle the issue. For one, most of the
> looping will be in GetNewOidWithIndex(), so limiting looping in
> toast_save_datum() won't help much. For another, if the limiting were in
> the
> right place, it'd break currently working cases. Due to oid wraparound it's
> pretty easy to hit "ranges" of allocated oids, without even getting close
> to
> 2^32 toasted datums.
>
> Greetings,
>
> Andres Freund
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Infinite loop while acquiring new TOAST Oid

2022-11-28 Thread Andres Freund
Hi,

On 2022-11-28 18:34:20 +0300, Nikita Malakhov wrote:
> While working on Pluggable TOAST we've detected a defective behavior
> on tables with large amounts of TOASTed data - queries freeze and DB stalls.
> Further investigation led us to the loop with GetNewOidWithIndex function
> call - when all available Oid already exist in the related TOAST table this
> loop continues infinitely. Data type used for value ID is the UINT32, which
> is
> unsigned int and has a maximum value of *4294967295* which allows
> maximum 4294967295 records in the TOAST table. It is not a very big amount
> for modern databases and is the major problem for productive systems.

I don't think the absolute number is the main issue - by default external
toasting will happen only for bigger datums. 4 billion external datums
typically use a lot of space.

If you hit this easily with your patch, then you likely broke the conditions
under which external toasting happens.

IMO the big issue is the global oid counter making it much easier to hit oid
wraparound. Due to that we end up assigning oids that conflict with existing
toast oids much sooner than 4 billion toasted datums.

I think the first step to improve the situation is to not use a global oid
counter for toasted values. One way to do that would be to use the sequence
code to do oid assignment, but we likely can find a more efficient
representation.

Eventually we should do the obvious thing and make toast ids 64bit wide - to
combat the space usage we likely should switch to representing the ids as
variable width integers or such, otherwise the space increase would likely be
prohibitive.


> Quick fix for this problem is limiting GetNewOidWithIndex loops to some
> reasonable amount defined by related macro and returning error if there is
> still no available Oid. Please check attached patch, any feedback is
> appreciated.

This feels like the wrong spot to tackle the issue. For one, most of the
looping will be in GetNewOidWithIndex(), so limiting looping in
toast_save_datum() won't help much. For another, if the limiting were in the
right place, it'd break currently working cases. Due to oid wraparound it's
pretty easy to hit "ranges" of allocated oids, without even getting close to
2^32 toasted datums.

Greetings,

Andres Freund