Re: Gradual migration from integer to bigint?

2023-10-06 Thread Bruce Momjian
On Sun, Oct  1, 2023 at 05:30:39AM -0400, Ann Harrison wrote:
> Other databases do allow that sort of gradual migration.  One example
> has an internal table of record descriptions indexed by the table identifier
> and a description number.  Each record includes a header with various 
> useful bits including its description number. When reading a record, 
> the system notes the description number and looks up the description 
> before parsing the record into columns.  
> 
> The transition is made easier if the database indexes are generic - 
> for example, numbers rather than decimal[12,6], int32, etc., and string 
> rather than varchar[12].   That way, increasing a column size doesn't
> require re-indexing.
> 
> But, those are decisions that really had to be made early - making
> a major format change 25+ years in would break too much.

And the performance sounds terrible.  ;-)

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Gradual migration from integer to bigint?

2023-10-05 Thread Nick Cleaton
On Sat, 30 Sept 2023, 23:37 Tom Lane wrote:

>
> I think what you're asking for is a scheme whereby some rows in a
> table have datatype X in a particular column while other rows in
> the very same physical table have datatype Y in the same column.
>

An alternative for NOT NULL columns would be to use a new attnum for the
bigint version of the id, but add a column to pg_attribute allowing linking
the new id col to the dropped old id col, to avoid the table rewrite.

Global read code change needed: on finding a NULL in a NOT NULL column,
check for a link to a dropped old col and use that value instead if found.
The check could be almost free in the normal case if there's already a
check for unexpected NULL or tuple too short.

Then a metadata-only operation can create the new id col and drop and
rename and link the old id col, and fix up fkeys etc for the attnum change.

Indexes are an issue. Require the in-advance creation of indexes like
btree(id::bigint) mirroring every index involving id maybe ? Those could
then be swapped in as part of the same metadata operation.
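
The read-path fallback proposed above can be sketched roughly as follows. This is a toy model in Python, not PostgreSQL code: rows are dicts keyed by attnum, and `LINKED_OLD_ATTNUM`, `NOT_NULL_ATTNUMS` and `read_column` are hypothetical names standing in for the proposed pg_attribute link, which does not exist today.

```python
# Hypothetical catalog state after the metadata-only change:
# attnum 1 is the dropped int4 id, attnum 3 is the new bigint id.
LINKED_OLD_ATTNUM = {3: 1}   # new attnum -> dropped old attnum (assumed link column)
NOT_NULL_ATTNUMS = {3}       # columns declared NOT NULL

def read_column(row, attnum):
    """Return the value of attnum, falling back to a linked dropped column.

    A missing key models a tuple that is too short to contain the column.
    """
    value = row.get(attnum)
    if value is None and attnum in NOT_NULL_ATTNUMS:
        # An unexpected NULL in a NOT NULL column: check for a linked old column.
        old_attnum = LINKED_OLD_ATTNUM.get(attnum)
        if old_attnum is not None:
            return row.get(old_attnum)
    return value

old_row = {1: 123, 2: "written before the change"}
new_row = {1: None, 2: "written after the change", 3: 5_000_000_000}

print(read_column(old_row, 3))  # 123
print(read_column(new_row, 3))  # 5000000000
```

The extra lookup only runs when a NOT NULL column comes back NULL, which is the "almost free in the normal case" property the proposal relies on. Indexes and foreign keys are not modelled here.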


Re: Gradual migration from integer to bigint?

2023-10-01 Thread Ron

On 10/1/23 12:04, Ireneusz Pluta wrote:
> On 30.09.2023 at 07:55, James Healy wrote:
> > ...
> > We shouldn't have let them get so big, but that's a conversation
> > for another day.
> >
> > Some are approaching overflow and we're slowly doing the work to
> > migrate to bigint. Mostly via the well understood "add a new id_bigint
> > column, populate on new tuples, backfill the old, switch the PK"
> > method. The backfill is slow on these large tables, but it works and
> > there's plenty of blog posts and documentation to follow.
>
> Wouldn't wrapping to negative numbers, as in
> https://www.youtube.com/watch?v=XYRgTazYuZ4&t=1338s, be a solution for you?
> At least for buying more time for the slow migration process. Or even as a
> definitive solution, if you now take care not to let the keys grow too quickly.

The application might not react well to negative numbers.

--
Born in Arizona, moved to Babylonia.




Re: Gradual migration from integer to bigint?

2023-10-01 Thread Ann Harrison
On Sat, Sep 30, 2023 at 11:37 PM Tom Lane wrote:

> James Healy writes:
> > However it doesn't really address the question of a gradual migration
> > process that can read 32bit ints but insert/update as 64bit bigints. I
> > remain curious about whether the postgres architecture just makes that
> > implausible, or if it could be done and just hasn't because the
> > options for a more manual migration are Good Enough.
>
> I think what you're asking for is a scheme whereby some rows in a
> table have datatype X in a particular column while other rows in
> the very same physical table have datatype Y in the same column.
> That is not happening, because there'd be no way to tell which
> case applies to any particular row.
>

Other databases do allow that sort of gradual migration.  One example
has an internal table of record descriptions indexed by the table identifier
and a description number.  Each record includes a header with various
useful bits including its description number. When reading a record,
the system notes the description number and looks up the description
before parsing the record into columns.

The transition is made easier if the database indexes are generic -
for example, numbers rather than decimal[12,6], int32, etc., and string
rather than varchar[12].   That way, increasing a column size doesn't
require re-indexing.

But, those are decisions that really had to be made early - making
a major format change 25+ years in would break too much.

Cheers,

Ann
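
The record-description scheme described above can be sketched roughly like this. It is a toy model in Python, not the format of any particular database: the `DESCRIPTIONS` table, the 2-byte header and the function names are all assumptions made for the example.

```python
import struct

# Hypothetical internal table of record descriptions, keyed by
# (table identifier, description number). Each entry is the body layout.
DESCRIPTIONS = {
    (1, 1): "<i",  # description 1: id stored as a 32-bit integer
    (1, 2): "<q",  # description 2: id stored as a 64-bit integer
}

# Assumed per-record header: just the description number, 2 bytes.
HEADER = struct.Struct("<H")

def write_record(table_id, desc_no, row_id):
    """Serialize one record using the given description."""
    body = struct.pack(DESCRIPTIONS[(table_id, desc_no)], row_id)
    return HEADER.pack(desc_no) + body

def read_record(table_id, record):
    """Read the header, look up the description, then parse the body."""
    (desc_no,) = HEADER.unpack_from(record, 0)
    fmt = DESCRIPTIONS[(table_id, desc_no)]
    (row_id,) = struct.unpack_from(fmt, record, HEADER.size)
    return row_id

old_record = write_record(1, 1, 42)      # written before the type change
new_record = write_record(1, 2, 2**40)   # written after the type change

print(read_record(1, old_record))  # 42
print(read_record(1, new_record))  # 1099511627776
```

Old and new records coexist in the same table because each one says which description to parse it with. The price in this sketch is the extra header bytes on every record plus a lookup on every read.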



Re: Gradual migration from integer to bigint?

2023-10-01 Thread Ireneusz Pluta

On 30.09.2023 at 07:55, James Healy wrote:
> ...
> We shouldn't have let them get so big, but that's a conversation
> for another day.
>
> Some are approaching overflow and we're slowly doing the work to
> migrate to bigint. Mostly via the well understood "add a new id_bigint
> column, populate on new tuples, backfill the old, switch the PK"
> method. The backfill is slow on these large tables, but it works and
> there's plenty of blog posts and documentation to follow.

Wouldn't wrapping to negative numbers, as in
https://www.youtube.com/watch?v=XYRgTazYuZ4&t=1338s, be a solution for you?
At least for buying more time for the slow migration process. Or even as a
definitive solution, if you now take care not to let the keys grow too quickly.
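
As a rough idea of how much time the negative range could buy, here is some back-of-the-envelope arithmetic in Python. It assumes the id generator is restarted at the smallest 32-bit integer and counts up towards -1; the helper names and the one-million-ids-per-day rate are made up for the example.

```python
INT4_MIN = -2**31      # smallest value of a 4-byte signed integer
INT4_MAX = 2**31 - 1   # largest value of a 4-byte signed integer

def ids_available(start, stop):
    """Number of ids in the inclusive range [start, stop]."""
    return stop - start + 1

def days_of_headroom(ids_left, ids_per_day):
    """Whole days until ids_left is used up at a steady allocation rate."""
    return ids_left // ids_per_day

positive_ids = ids_available(1, INT4_MAX)   # the range already being used up
negative_ids = ids_available(INT4_MIN, -1)  # the untouched negative range

print(positive_ids)  # 2147483647
print(negative_ids)  # 2147483648
print(days_of_headroom(negative_ids, 1_000_000))  # 2147
```

So the negative half roughly doubles the usable key space, provided every consumer of the ids copes with negative values.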





Re: Gradual migration from integer to bigint?

2023-10-01 Thread James Healy
On Sun, 1 Oct 2023 at 14:37, Tom Lane wrote:
> I think what you're asking for is a scheme whereby some rows in a
> table have datatype X in a particular column while other rows in
> the very same physical table have datatype Y in the same column.
> That is not happening, because there'd be no way to tell which
> case applies to any particular row.

To be honest, I don't know enough about the postgresql on-disk format
and tuple shape to be confident in how this would be solved. I was
thinking more about the ergonomics of what would be helpful and
wondering how viable it was.

Sounds like not very viable. Rats.

The docs [1] on changing column types include:

> As an exception, when changing the type of an existing column, if the USING 
> clause does not change the column contents and the old type is either binary 
> coercible to the new type or an unconstrained domain over the new type, a 
> table rewrite is not needed

... and mention the specific case of switching between VARCHAR and
TEXT not requiring a table or index rewrite.

Seems like the specific case of int->bigint is impossible to make as
easy, given the fixed sizes in the tuple and impossibility of knowing
from tuple to tuple whether to read 4 or 8 bytes.

regards,
James

[1] https://www.postgresql.org/docs/current/sql-altertable.html
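
The 4-versus-8-byte ambiguity mentioned above can be illustrated with a small Python sketch. The 8-byte row body and the function names are hypothetical; real PostgreSQL heap tuples also have a header, a null bitmap and alignment padding, none of which is modelled here.

```python
import struct

def read_as_int4_pair(raw):
    """Decode 8 bytes as a 4-byte id followed by a 4-byte second column."""
    return struct.unpack("<ii", raw)

def read_as_int8(raw):
    """Decode the same 8 bytes as a single 8-byte id."""
    return struct.unpack("<q", raw)[0]

# A row written under the old layout: id = 7, second column = 1.
old_row = struct.pack("<ii", 7, 1)

print(read_as_int4_pair(old_row))  # (7, 1)
print(read_as_int8(old_row))       # 4294967303
```

Both decodings succeed, and nothing in the bytes themselves says which one is right. That information has to come from the catalog, which describes the column once for the whole table.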




Re: Gradual migration from integer to bigint?

2023-09-30 Thread Ron

On 9/30/23 22:37, Tom Lane wrote:
[snip]
> especially not a break that adds more per-row overhead.
>
> So really the only way forward for this would be to provide more
> automation for the existing conversion processes involving table
> rewrites.

When altering an unindexed INT column to BIGINT, do all of the table's
indices get rewritten?





Re: Gradual migration from integer to bigint?

2023-09-30 Thread Tom Lane
James Healy writes:
> However it doesn't really address the question of a gradual migration
> process that can read 32bit ints but insert/update as 64bit bigints. I
> remain curious about whether the postgres architecture just makes that
> implausible, or if it could be done and just hasn't because the
> options for a more manual migration are Good Enough.

I think what you're asking for is a scheme whereby some rows in a
table have datatype X in a particular column while other rows in
the very same physical table have datatype Y in the same column.
That is not happening, because there'd be no way to tell which
case applies to any particular row.

You could fantasize about labeling individual rows somehow, but
it's mere fantasy because there's no place to put such labels.
To the limited extent that we can find spare space in the
existing page layout, there are far better use-cases (see
nearby discussions about 64-bit XIDs, for example).  And nobody
is going to advocate breaking on-disk compatibility for this,
especially not a break that adds more per-row overhead.

So really the only way forward for this would be to provide more
automation for the existing conversion processes involving table
rewrites.  That's possible perhaps, but it doesn't really sound
compelling enough to justify a lot of work.

regards, tom lane




Re: Gradual migration from integer to bigint?

2023-09-30 Thread James Healy
On Sun, 1 Oct 2023 at 04:35, Bruce Momjian wrote:
> I think this talk will help you:
>
> https://www.youtube.com/watch?v=XYRgTazYuZ4

Thanks, I hadn't seen that talk and it's a good summary of the issue
and available solutions.

However it doesn't really address the question of a gradual migration
process that can read 32bit ints but insert/update as 64bit bigints. I
remain curious about whether the postgres architecture just makes that
implausible, or if it could be done and just hasn't because the
options for a more manual migration are Good Enough.

James




Re: Gradual migration from integer to bigint?

2023-09-30 Thread Bruce Momjian
On Sat, Sep 30, 2023 at 03:55:20PM +1000, James Healy wrote:
> My organization has a number of very large tables (most 100s of GB, a
> couple over a TB) that were created many years ago by a tool that
> defaulted to integer PKs rather than bigint. Those PKs have a number
> of integer FKs in related tables as well. We shouldn't have let them
> get so big, but that's a conversation for another day.
> 
> Some are approaching overflow and we're slowly doing the work to
> migrate to bigint. Mostly via the well understood "add a new id_bigint
> column, populate on new tuples, backfill the old, switch the PK"
> method. The backfill is slow on these large tables, but it works and
> there's plenty of blog posts and documentation to follow.
> 
> It did make me curious though: would it be possible for postgres to
> support gradual migration from integer to bigint in a more transparent
> way, where new and updated tuples are written as bigint, but existing
> tuples can be read as integer?
> 
> I assume maybe a complication is that the catalog says the column is
> either 32bit int or 64bit bigint and making that conditional is hard.
> There's presumably other considerations I'm unaware of too. My core
> question: are there significant technical blockers to supporting this
> kind of gradual in place migration, or has it just not been enough of
> a problem that it's received attention?

I think this talk will help you:

https://www.youtube.com/watch?v=XYRgTazYuZ4





Re: Gradual migration from integer to bigint?

2023-09-30 Thread grimy . outshine830
On Sat, Sep 30, 2023 at 03:55:20PM +1000, James Healy wrote:

> It did make me curious though: would it be possible for postgres to
> support gradual migration from integer to bigint in a more
> transparent way, where new and updated tuples are written as bigint,
> but existing tuples can be read as integer?

Language police: this is the *opposite* of "transparent". "Transparent"
and "automated" are not synonyms.

-- 
Ian