Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Shaheed Haque
Might it be worth a modest amount of time using some basic profiling to see
where the time is going? A week is a looonnngg time, even for 150e6
operations. For example, if there an unexpectedly high IO load, some
temporary M.2 storage might help?

On Tue, 6 Feb 2024, 01:36 Ron Johnson,  wrote:

> On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis 
> wrote:
>
>> Hi,
>>
>> We've inherited a series of legacy PG 12 clusters that each contain a
>> database that we need to migrate to a PG 15 cluster. Each database contains
>> about 150 million large objects totaling about 250GB.
>>
>
> 250*10^9 / (150*10^6) = 1667 bytes.  That's *tiny*.
>
> Am I misunderstanding you?
>
>>


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Adrian Klaver

On 2/5/24 12:32, Ken Marshall wrote:

On Mon, Feb 05, 2024 at 09:17:53PM +0100, Andreas Joseph Krogh wrote:


På mandag 05. februar 2024 kl. 21:10:53, skrev Wyatt Tellis <
wyatt.tel...@gmail.com >:

No, we don't have the ability to make schema changes and the schema in the
PG15 copy needs to match what's in the PG 12 versions

Well then, I guess it boils down to how badly the ones in charge wants this
migration…

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
  


Hi,

Can you use logical replication? Then you aren't under the gun to make
a fast copy.


https://www.postgresql.org/docs/current/logical-replication-restrictions.html

"Large objects (see Chapter 35) are not replicated. There is no 
workaround for that, other than storing data in normal tables."





Regards,
Ken




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Tom Lane
Wyatt Tellis  writes:
> Yes, the LOBs themselves are tiny, but there are a lot of them (~150
> million) which seem to be slowing down pg_dump.  Note, we did not
> design/build this system and agree that use of LOBs for this purpose was
> not necessary.

I don't know of anything much you can do at the user level.  But there
is a patchset under development to improve pg_dump's behavior with
tons of blobs:

https://www.postgresql.org/message-id/842242.1706287...@sss.pgh.pa.us

That's intended for v17, and I'm not sure how tough it might be to
adapt to v15 pg_dump, but maybe you could make that happen.  In any
case, more eyeballs reviewing that patchset would be welcome.

regards, tom lane




Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Ken Marshall
On Mon, Feb 05, 2024 at 09:17:53PM +0100, Andreas Joseph Krogh wrote:
> 
> På mandag 05. februar 2024 kl. 21:10:53, skrev Wyatt Tellis <
> wyatt.tel...@gmail.com >:
> 
> No, we don't have the ability to make schema changes and the schema in the 
> PG15 copy needs to match what's in the PG 12 versions
> 
> Well then, I guess it boils down to how badly the ones in charge wants this 
> migration…
> 
> --
> Andreas Joseph Krogh
> CTO / Partner - Visena AS
> Mobile: +47 909 56 963
> andr...@visena.com 
> www.visena.com 
>  

Hi,

Can you use logical replication? Then you aren't under the gun to make
a fast copy.

Regards,
Ken




Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Andreas Joseph Krogh


På mandag 05. februar 2024 kl. 21:10:53, skrev Wyatt Tellis <
wyatt.tel...@gmail.com >:

No, we don't have the ability to make schema changes and the schema in the 
PG15 copy needs to match what's in the PG 12 versions

Well then, I guess it boils down to how badly the ones in charge wants this 
migration…






--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Ron Johnson
Just checking... and I sympathize with your situation.

On Mon, Feb 5, 2024 at 2:56 PM Wyatt Tellis  wrote:

> Yes, the LOBs themselves are tiny, but there are a lot of them (~150
> million) which seem to be slowing down pg_dump.  Note, we did not
> design/build this system and agree that use of LOBs for this purpose was
> not necessary.
>
> Wyatt
>
> On Mon, Feb 5, 2024 at 11:36 AM Ron Johnson 
> wrote:
>
>> On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis 
>> wrote:
>>
>>> Hi,
>>>
>>> We've inherited a series of legacy PG 12 clusters that each contain a
>>> database that we need to migrate to a PG 15 cluster. Each database contains
>>> about 150 million large objects totaling about 250GB.
>>>
>>
>> 250*10^9 / (150*10^6) = 1667 bytes.  That's *tiny*.
>>
>> Am I misunderstanding you?
>>
>>>


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Wyatt Tellis
No, we don't have the ability to make schema changes and the schema in the
PG15 copy needs to match what's in the PG 12 versions

Wyatt

On Mon, Feb 5, 2024 at 12:05 PM Andreas Joseph Krogh 
wrote:

> På mandag 05. februar 2024 kl. 20:55:43, skrev Wyatt Tellis <
> wyatt.tel...@gmail.com>:
>
> Yes, the LOBs themselves are tiny, but there are a lot of them (~150
> million) which seem to be slowing down pg_dump.  Note, we did not
> design/build this system and agree that use of LOBs for this purpose was
> not necessary.
>
> Well, the data is there nonetheless, is it an option to convert it to
> bytea before migration?
>
>
> --
> *Andreas Joseph Krogh*
> CTO / Partner - Visena AS
> Mobile: +47 909 56 963
> andr...@visena.com
> www.visena.com
> 
>
>


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Andreas Joseph Krogh


På mandag 05. februar 2024 kl. 20:55:43, skrev Wyatt Tellis <
wyatt.tel...@gmail.com >:

Yes, the LOBs themselves are tiny, but there are a lot of them (~150 million) 
which seem to be slowing down pg_dump. Note, we did not design/build this 
system and agree that use of LOBs for this purpose was not necessary.

Well, the data is there nonetheless, is it an option to convert it to bytea 
before migration?






--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Wyatt Tellis
Yes, the LOBs themselves are tiny, but there are a lot of them (~150
million) which seem to be slowing down pg_dump.  Note, we did not
design/build this system and agree that use of LOBs for this purpose was
not necessary.

Wyatt

On Mon, Feb 5, 2024 at 11:36 AM Ron Johnson  wrote:

> On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis 
> wrote:
>
>> Hi,
>>
>> We've inherited a series of legacy PG 12 clusters that each contain a
>> database that we need to migrate to a PG 15 cluster. Each database contains
>> about 150 million large objects totaling about 250GB.
>>
>
> 250*10^9 / (150*10^6) = 1667 bytes.  That's *tiny*.
>
> Am I misunderstanding you?
>
>>


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Adrian Klaver


On 2/5/24 11:35 AM, Ron Johnson wrote:
On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis  
wrote:


Hi,

We've inherited a series of legacy PG 12 clusters that each
contain a database that we need to migrate to a PG 15 cluster.
Each database contains about 150 million large objects totaling
about 250GB.


250*10^9 / (150*10^6) = 1667 bytes.  That's *tiny*.

Am I misunderstanding you?



I think it less about the size of the individual objects then the number 
(150 million) of them.


AFAIK that can't be handled by COPY, therefore they have to be 
transferred individually.


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Ron Johnson
On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis  wrote:

> Hi,
>
> We've inherited a series of legacy PG 12 clusters that each contain a
> database that we need to migrate to a PG 15 cluster. Each database contains
> about 150 million large objects totaling about 250GB.
>

250*10^9 / (150*10^6) = 1667 bytes.  That's *tiny*.

Am I misunderstanding you?

>


Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Wyatt Tellis
Hi,

We've inherited a series of legacy PG 12 clusters that each contain a
database that we need to migrate to a PG 15 cluster. Each database contains
about 150 million large objects totaling about 250GB. When using pg_dump
we've found that it takes a couple of weeks to dump out this much data.
We've tried using the jobs option with the directory format but that seems
to save each LOB separately which makes moving the resulting dump to
another location unwieldy.  Has anyone else had to deal with dumping a
database with these many LOBs?  Are there any suggestions for how to
improve performance?

Thanks,

Wyatt