Re: Insert with Jsonb column hangs

2024-03-09 Thread Rob Sargent


> On Mar 9, 2024, at 9:01 AM, kuldeep singh  wrote:
> 
> 
> Copy may not work in our scenario since we need to join data from multiple 
> tables & then  convert it to json using  row_to_json . This json data 
> eventually  needs to be stored in a target table . 
Wait. You're getting the data from the database, converting that into json and 
writing that back to the database?  Can you not feed the json consumer from 
those sources?



Re: creating a subset DB efficiently ?

2024-03-09 Thread Peter J. Holzer
On 2024-03-08 11:22:17 -0500, David Gauthier wrote:
> Related question...
> The "delete from par_tbl_a where project <> 'a' " is taking forever.

Have you checked whether all the FK columns have an index? Otherwise
checking for references means a full table scan which may take a long
time.
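
A rough catalog query along these lines (a sketch; it only checks the first
column of each foreign key) lists FK constraints whose referencing column is
not the leading column of any index:

    -- Sketch: FK constraints with no index led by the referencing column
    SELECT c.conrelid::regclass AS table_name,
           c.conname            AS constraint_name
    FROM pg_constraint c
    WHERE c.contype = 'f'
      AND NOT EXISTS (
            SELECT 1
            FROM pg_index i
            WHERE i.indrelid = c.conrelid
              AND i.indkey[0] = c.conkey[1]  -- conkey is 1-based, indkey 0-based
          );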

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Help diagnosing replication (copy) error

2024-03-09 Thread Steve Baldwin
On Sun, Mar 10, 2024 at 3:16 AM Adrian Klaver 
wrote:

>
> Is the case still active?
> Can you get information from them about what they saw?
>
> I've re-opened the case and asked for an explanation of the error and what
they did to resolve it.

Hopefully they shed some light on the 'mystery'.

Steve


Query plans not identical for `id in(1,2,3)` and `(id=1 or id=2 or id=3)`

2024-03-09 Thread Ilya Basin
Hi List.

I have a list of bigint keys and I need to retrieve rows by these keys. 
Normally, I would split this list into pages of size 900 and perform several 
`SELECT ... WHERE key in($1,$2,...)`. However, the proprietary ORM we use can 
only produce this SQL: `SELECT ... WHERE (key=$1 or key=$2 or ...)`. 
Surprisingly, PostgreSQL planner treats these two SQLs differently:

- ```select * FROM "audittrail$referencelogline" where id in ( 1 , 2 , 3 )```

Index Scan
https://i.stack.imgur.com/dr8oz.png

- ```select * FROM "audittrail$referencelogline" where id = 1 or id = 2 or id = 
3```

A lot of "Bitmap Index Scan" for each value 
https://i.stack.imgur.com/dnErs.png

Is it possible to configure PostgreSQL 12.16 to treat the second query as the 
first?
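
For what it's worth, if the ORM could be persuaded to bind a single array
parameter instead, an ANY(...) comparison is parsed the same way as IN and
gets the same plan (a sketch using the table from the question):

    select * FROM "audittrail$referencelogline"
    where id = ANY (ARRAY[1, 2, 3]::bigint[]);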





Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread hassan rafi
Thanks,

Would upgrading to the latest version of Postgres potentially solve the
issue?

On Sat, Mar 9, 2024 at 11:30 PM Tom Lane  wrote:

> hassan rafi  writes:
> > The issue of high query planning time seems to intermittently resolve
> > itself, only to reoccur after a few hours.
>
> I wonder if you are running into the lack of this fix:
>
> Author: Tom Lane 
> Branch: master Release: REL_16_BR [9c6ad5eaa] 2022-11-22 14:40:20 -0500
> Branch: REL_15_STABLE Release: REL_15_2 [2debceed2] 2022-11-22 14:40:44
> -0500
> Branch: REL_14_STABLE Release: REL_14_7 [bd06fe4de] 2022-11-22 14:40:45
> -0500
> Branch: REL_13_STABLE Release: REL_13_10 [6e639267a] 2022-11-22 14:40:45
> -0500
> Branch: REL_12_STABLE Release: REL_12_14 [ec10b6139] 2022-11-22 14:40:45
> -0500
> Branch: REL_11_STABLE Release: REL_11_19 [b96a096db] 2022-11-22 14:40:46
> -0500
>
> YA attempt at taming worst-case behavior of get_actual_variable_range.
>
> We've made multiple attempts at preventing get_actual_variable_range
> from taking an unreasonable amount of time (3ca930fc3, fccebe421).
> But there's still an issue for the very first planning attempt after
> deletion of a large number of extremal-valued tuples.  While that
> planning attempt will set "killed" bits on the tuples it visits and
> thereby reduce effort for next time, there's still a lot of work it
> has to do to visit the heap and then set those bits.  It's (usually?)
> not worth it to do that much work at plan time to have a slightly
> better estimate, especially in a context like this where the table
> contents are known to be mutating rapidly.
>
> Therefore, let's bound the amount of work to be done by giving up
> after we've visited 100 heap pages.  Giving up just means we'll
> fall back on the extremal value recorded in pg_statistic, so it
> shouldn't mean that planner estimates suddenly become worthless.
>
> Note that this means we'll still gradually whittle down the problem
> by setting a few more index "killed" bits in each planning attempt;
> so eventually we'll reach a good state (barring further deletions),
> even in the absence of VACUUM.
>
> Simon Riggs, per a complaint from Jakub Wartak (with cosmetic
> adjustments by me).  Back-patch to all supported branches.
>
> Discussion:
> https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vncfc8fs...@mail.gmail.com
>
> As noted, that did make it into the 11.x branch, but not till 11.19.
>
> regards, tom lane
>


Re: creating a subset DB efficiently ?

2024-03-09 Thread Shaheed Haque
I go more or less the other way. I have a script which follows the
child...parent...grandparent...etc foreign keys in the source database and
dumps only the records belonging to the selected "project" (your
terminology, in my case it is "client"). I save the dumped data to an
archive file.

The script has hardcoded knowledge of only the minimal set of root ancestor
tables (and certain special cases, mostly relating to the fact that I'm
talking about a Django ORM-managed schema) and otherwise tries to avoid
knowing too much, so that the schema can evolve mostly without bothering the
script.

The script includes meta data about the dump in a "manifest". The manifest
records, amongst other things, the order in which the tables were dumped.
The restore operation uses this to correctly order the table restores.

I can then simply restore the archive, table by table, from the individual
dumps, using a script which walks the manifest.
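
A minimal sketch of the per-table dump step (table, column and path names
here are invented; the real script derives them by walking the foreign keys):

    COPY (SELECT * FROM parent_table WHERE client_id = 42)
        TO '/archive/client_42/001_parent_table.copy';
    COPY (SELECT c.* FROM child_table c
          JOIN parent_table p ON p.id = c.parent_id
          WHERE p.client_id = 42)
        TO '/archive/client_42/002_child_table.copy';

The restore side then walks the manifest and runs COPY ... FROM on each file
in the recorded order, so parents land before their children.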






On Sat, 9 Mar 2024, 14:56 hector vass,  wrote:

> On Fri, Mar 8, 2024 at 4:22 PM David Gauthier 
> wrote:
>
>> Here's the situation
>>
>> - The DB contains data for several projects.
>> - The tables of the DB contain data for all projects (data is not
>> partitioned on project name or anything like that)
>> - The "project" identifier (table column) exists in a few "parent" tables
>> with many child... grandchild,... tables under them connected with foreign
>> keys defined with "on delete cascade".  So if a record in one of the parent
>> table records is deleted, all of its underlying, dependent records get
>> deleted too.
>> - New projects come in, and old ones need to be removed and "archived" in
>> DBs of their own.  So there's a DB called "active_projects" and there's a
>> DB called "project_a_archive" (identical metadata).
>> - The idea is to copy the data for project "a" that's in
>> "active_projects" to the "project_a_arhchive" DB AND delete the project a
>> data out of "active_projects".
>> - Leave "project_a_archive" up and running if someone needs to attach to
>> that and get some old/archived data.
>>
>> The brute-force method I've been using is...
>> 1)  pg_dump "active_projects" to a (huge) file then populate
>> "project_a_archive" using that (I don't have the privs to create database,
>> IT creates an empty one for me, so this is how I do it).
>> 2) go into the "project_a_archive" DB and run... "delete from par_tbl_1
>> where project <> 'a' ", "delete from par_tbl_2 where project <> 'a' ",
>> etc... leaving only project "a" data in the DB.
>> 3) go into the "active_projects" DB and "delete from par_tbl_1 where
>> project = 'a' ", etc... removing project "a" from the "active_projects DB.
>>
>> Ya, not very elegant, it takes a long time and it takes a lot of
>> resources.  So I'm looking for ideas on how to do this better.
>>
>> Related question...
>> The "delete from par_tbl_a where project <> 'a' " is taking forever.  I
>> fear it's because it's trying to journal everything in case I want to
>> rollback.  But this is just in the archive DB and I don't mind taking the
>> risk if I can speed this up outside of a transaction.  How can I run a
>> delete command like this without the rollback recovery overhead ?
>>
>
>
> >(I don't have the privs to create database, IT creates an empty one for
> me, so this is how I do it).
>
> That's a shame.  You can do something similar with tablespaces
>   Template your existing schema to create a new schema for the project
> (pg_dump -s)
>   Create tablespace for this new project and schema
>
>  You can then move the physical tablespace to cheaper disk and use
> symbolic links or... archive and/or back it up at the schema level with
> pg_dump -n
>
> ...as long as you don't put anything in the public schema all you are
> really sharing is roles otherwise a bit like a separate database
>
>
>
>
>


Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread Tom Lane
hassan rafi  writes:
> The issue of high query planning time seems to intermittently resolve
> itself, only to reoccur after a few hours.

I wonder if you are running into the lack of this fix:

Author: Tom Lane 
Branch: master Release: REL_16_BR [9c6ad5eaa] 2022-11-22 14:40:20 -0500
Branch: REL_15_STABLE Release: REL_15_2 [2debceed2] 2022-11-22 14:40:44 -0500
Branch: REL_14_STABLE Release: REL_14_7 [bd06fe4de] 2022-11-22 14:40:45 -0500
Branch: REL_13_STABLE Release: REL_13_10 [6e639267a] 2022-11-22 14:40:45 -0500
Branch: REL_12_STABLE Release: REL_12_14 [ec10b6139] 2022-11-22 14:40:45 -0500
Branch: REL_11_STABLE Release: REL_11_19 [b96a096db] 2022-11-22 14:40:46 -0500

YA attempt at taming worst-case behavior of get_actual_variable_range.

We've made multiple attempts at preventing get_actual_variable_range
from taking an unreasonable amount of time (3ca930fc3, fccebe421).
But there's still an issue for the very first planning attempt after
deletion of a large number of extremal-valued tuples.  While that
planning attempt will set "killed" bits on the tuples it visits and
thereby reduce effort for next time, there's still a lot of work it
has to do to visit the heap and then set those bits.  It's (usually?)
not worth it to do that much work at plan time to have a slightly
better estimate, especially in a context like this where the table
contents are known to be mutating rapidly.

Therefore, let's bound the amount of work to be done by giving up
after we've visited 100 heap pages.  Giving up just means we'll
fall back on the extremal value recorded in pg_statistic, so it
shouldn't mean that planner estimates suddenly become worthless.

Note that this means we'll still gradually whittle down the problem
by setting a few more index "killed" bits in each planning attempt;
so eventually we'll reach a good state (barring further deletions),
even in the absence of VACUUM.

Simon Riggs, per a complaint from Jakub Wartak (with cosmetic
adjustments by me).  Back-patch to all supported branches.

Discussion: 
https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vncfc8fs...@mail.gmail.com

As noted, that did make it into the 11.x branch, but not till 11.19.

regards, tom lane




Re: creating a subset DB efficiently ?

2024-03-09 Thread Adrian Klaver

On 3/8/24 08:22, David Gauthier wrote:

Here's the situation

- The DB contains data for several projects.
- The tables of the DB contain data for all projects (data is not 
partitioned on project name or anything like that)
- The "project" identifier (table column) exists in a few "parent" 
tables with many child... grandchild,... tables under them connected 
with foreign keys defined with "on delete cascade".  So if a record in 
one of the parent table records is deleted, all of its underlying, 
dependent records get deleted too.


How many "... child... grandchild,... tables" ?

Do these tables constitute all the tables in the database?



Related question...
The "delete from par_tbl_a where project <> 'a' " is taking forever.  I 
fear it's because it's trying to journal everything in case I want to 
rollback.  But this is just in the archive DB and I don't mind taking 
the risk if I can speed this up outside of a transaction.  How can I run 
a delete command like this without the rollback recovery overhead ?


I am assuming that at the point you do "delete from par_tbl_a where 
project <> 'a' ", project a is no longer receiving data and its records 
are static. Further assuming there is a PK you could order by, the way to 
go would be to delete in batches as determined by the PK.
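
A minimal sketch of that, assuming par_tbl_a has an integer PK named id
(repeat until a run deletes zero rows):

    DELETE FROM par_tbl_a
    WHERE id IN (SELECT id
                 FROM par_tbl_a
                 WHERE project <> 'a'
                 ORDER BY id
                 LIMIT 10000);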



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Insert with Jsonb column hangs

2024-03-09 Thread kuldeep singh
Thanks Adrian & Hector .

I will try the copy approach & post the result here.

On Sat, Mar 9, 2024 at 9:57 PM hector vass  wrote:

>
>
> On Sat, Mar 9, 2024 at 4:10 PM Adrian Klaver 
> wrote:
>
>> On 3/9/24 08:00, kuldeep singh wrote:
>> > Copy may not work in our scenario since we need to join data from
>> > multiple tables & then  convert it to json using  row_to_json . This
>> > json data eventually  needs to be stored in a target table .
>>
>> Per:
>>
>> https://www.postgresql.org/docs/current/sql-copy.html
>>
>> "
>> COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
>>
>> <...>
>>
>> query
>>
>>  A SELECT, VALUES, INSERT, UPDATE, or DELETE command whose results
>> are to be copied. Note that parentheses are required around the query.
>>
>>  For INSERT, UPDATE and DELETE queries a RETURNING clause must be
>> provided, and the target relation must not have a conditional rule, nor
>> an ALSO rule, nor an INSTEAD rule that expands to multiple statements.
>> "
>>
>> >
>> > Will it be better if we break the process into batches of like 10,000
>> > rows & insert the data in its individual transactions? Or any other
>> > better solution available ?
>> >
>> > On Sat, Mar 9, 2024 at 9:01 PM hector vass > > > wrote:
>> >
>> >
>> >
>> > On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh
>> > mailto:kuldeeparor...@gmail.com>> wrote:
>> >
>> > Hi,
>> >
>> > We are inserting data close to 1M record & having a single Jsonb
>> > column but query is getting stuck.
>> >
>> > We are using insert into select * .. , so all the operations are
>> > within the DB.
>> >
>> > If we are running select query individually then it is returning
>> > the data in 40 sec for all rows but with insert it is getting
>> stuck.
>> >
>> > PG Version - 15.
>> >
>> > What could be the problem here ?
>> >
>> > Regards,
>> > KD
>> >
>> >
>> > insert 1M rows especially JSON that can be large, variable in size
>> > and stored as blobs and indexed is not perhaps the correct way to do
>> > this
>> > insert performance will also depend on your tuning.  Supporting
>> > transactions, users or bulk processing are 3x sides of a compromise.
>> > you should perhaps consider that insert is for inserting a few rows
>> > into live tables ... you might be better using copy or \copy,
>> > pg_dump if you are just trying to replicate a large table
>> >
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>
>
> What Adrian Klaver said ^
> discovered even this works...
>
>
> create view myview as (select row_to_json from mytable);
>
> create table newtable as select * from myview where 1=0;
>
> copy myview to program 'psql mydb postgres -c ''copy newtable from stdin''
> ';
>
>
>
>
>


Re: Insert with Jsonb column hangs

2024-03-09 Thread hector vass
On Sat, Mar 9, 2024 at 4:10 PM Adrian Klaver 
wrote:

> On 3/9/24 08:00, kuldeep singh wrote:
> > Copy may not work in our scenario since we need to join data from
> > multiple tables & then  convert it to json using  row_to_json . This
> > json data eventually  needs to be stored in a target table .
>
> Per:
>
> https://www.postgresql.org/docs/current/sql-copy.html
>
> "
> COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
>
> <...>
>
> query
>
>  A SELECT, VALUES, INSERT, UPDATE, or DELETE command whose results
> are to be copied. Note that parentheses are required around the query.
>
>  For INSERT, UPDATE and DELETE queries a RETURNING clause must be
> provided, and the target relation must not have a conditional rule, nor
> an ALSO rule, nor an INSTEAD rule that expands to multiple statements.
> "
>
> >
> > Will it be better if we break the process into batches of like 10,000
> > rows & insert the data in its individual transactions? Or any other
> > better solution available ?
> >
> > On Sat, Mar 9, 2024 at 9:01 PM hector vass  > > wrote:
> >
> >
> >
> > On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh
> > mailto:kuldeeparor...@gmail.com>> wrote:
> >
> > Hi,
> >
> > We are inserting data close to 1M record & having a single Jsonb
> > column but query is getting stuck.
> >
> > We are using insert into select * .. , so all the operations are
> > within the DB.
> >
> > If we are running select query individually then it is returning
> > the data in 40 sec for all rows but with insert it is getting
> stuck.
> >
> > PG Version - 15.
> >
> > What could be the problem here ?
> >
> > Regards,
> > KD
> >
> >
> > insert 1M rows especially JSON that can be large, variable in size
> > and stored as blobs and indexed is not perhaps the correct way to do
> > this
> > insert performance will also depend on your tuning.  Supporting
> > transactions, users or bulk processing are 3x sides of a compromise.
> > you should perhaps consider that insert is for inserting a few rows
> > into live tables ... you might be better using copy or \copy,
> > pg_dump if you are just trying to replicate a large table
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com


What Adrian Klaver said ^
discovered even this works...


create view myview as (select row_to_json from mytable);

create table newtable as select * from myview where 1=0;

copy myview to program 'psql mydb postgres -c ''copy newtable from stdin'' '
;


Re: Help diagnosing replication (copy) error

2024-03-09 Thread Adrian Klaver

On 3/8/24 22:26, Steve Baldwin wrote:



On Sat, Mar 9, 2024 at 11:06 AM Jeff Ross > wrote:



RDS is a black box--who knows what's really going on there?  It
would be
interesting to see what the response is after you open a support case.
I hope you'll be able to share that with the list.

This is very mysterious. I logged the case, and before it had been 
picked up by an analyst, the issue somehow resolved itself without me 
doing anything.


Is the case still active?
Can you get information from them about what they saw?

I am not a big believer in coincidences: you reported a problem and 
then the problem disappeared.




I now have 418M+ rows in the table that it got stuck on.

:shrug:

Thanks Adrian and Jeff for responding.

Steve

Jeff








--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Insert with Jsonb column hangs

2024-03-09 Thread hector vass
copy syntax can include any valid select statement

  COPY (any valid select statement joining tables and converting it
row_to_json) TO 'some_dump_file'

or can copy a view
  CREATE VIEW myview AS (any valid select statement joining tables and
converting it with row_to_json);
  COPY myview TO 'some_dump_file'
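
For instance, a concrete sketch of the first form (table, column and file
names here are invented):

    COPY (SELECT row_to_json(t)
          FROM (SELECT a.id, a.name, b.detail
                FROM table_a a
                JOIN table_b b ON b.a_id = a.id) AS t)
    TO '/tmp/joined_rows.json';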


Regards
Hector Vass
07773 352559


On Sat, Mar 9, 2024 at 4:01 PM kuldeep singh 
wrote:

> Copy may not work in our scenario since we need to join data from multiple
> tables & then  convert it to json using  row_to_json . This json data
> eventually  needs to be stored in a target table .
>
> Will it be better if we break the process into batches of like 10,000 rows
> & insert the data in its individual transactions? Or any other better
> solution available ?
>
> On Sat, Mar 9, 2024 at 9:01 PM hector vass  wrote:
>
>>
>>
>> On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh 
>> wrote:
>>
>>> Hi,
>>>
>>> We are inserting data close to 1M record & having a single Jsonb column
>>> but query is getting stuck.
>>>
>>> We are using insert into select * .. , so all the operations are within
>>> the DB.
>>>
>>> If we are running select query individually then it is returning the
>>> data in 40 sec for all rows but with insert it is getting stuck.
>>>
>>> PG Version - 15.
>>>
>>> What could be the problem here ?
>>>
>>> Regards,
>>> KD
>>>
>>
>> insert 1M rows especially JSON that can be large, variable in size and
>> stored as blobs and indexed is not perhaps the correct way to do this
>> insert performance will also depend on your tuning.  Supporting
>> transactions, users or bulk processing are 3x sides of a compromise.
>> you should perhaps consider that insert is for inserting a few rows into
>> live tables ... you might be better using copy or \copy, pg_dump if you are
>> just trying to replicate a large table
>>
>>


Re: Insert with Jsonb column hangs

2024-03-09 Thread Adrian Klaver

On 3/9/24 08:00, kuldeep singh wrote:
Copy may not work in our scenario since we need to join data from 
multiple tables & then  convert it to json using  row_to_json . This 
json data eventually  needs to be stored in a target table .


Per:

https://www.postgresql.org/docs/current/sql-copy.html

"
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }

<...>

query

A SELECT, VALUES, INSERT, UPDATE, or DELETE command whose results 
are to be copied. Note that parentheses are required around the query.


For INSERT, UPDATE and DELETE queries a RETURNING clause must be 
provided, and the target relation must not have a conditional rule, nor 
an ALSO rule, nor an INSTEAD rule that expands to multiple statements.

"



Will it be better if we break the process into batches of like 10,000 
rows & insert the data in its individual transactions? Or any other 
better solution available ?


On Sat, Mar 9, 2024 at 9:01 PM hector vass > wrote:




On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh
mailto:kuldeeparor...@gmail.com>> wrote:

Hi,

We are inserting data close to 1M record & having a single Jsonb
column but query is getting stuck.

We are using insert into select * .. , so all the operations are
within the DB.

If we are running select query individually then it is returning
the data in 40 sec for all rows but with insert it is getting stuck.

PG Version - 15.

What could be the problem here ?

Regards,
KD


insert 1M rows especially JSON that can be large, variable in size
and stored as blobs and indexed is not perhaps the correct way to do
this
insert performance will also depend on your tuning.  Supporting
transactions, users or bulk processing are 3x sides of a compromise.
you should perhaps consider that insert is for inserting a few rows
into live tables ... you might be better using copy or \copy,
pg_dump if you are just trying to replicate a large table



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Insert with Jsonb column hangs

2024-03-09 Thread kuldeep singh
Copy may not work in our scenario since we need to join data from multiple
tables and then convert it to JSON using row_to_json. This JSON data
eventually needs to be stored in a target table.

Would it be better to break the process into batches of, say, 10,000 rows
and insert the data in individual transactions? Or is there another, better
solution available?
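
One way such batching could look, sketched with hypothetical table and column
names and a bigint key to page on (each statement run in its own transaction):

    INSERT INTO target_table (doc)   -- doc assumed jsonb, hence the cast
    SELECT row_to_json(t)::jsonb
    FROM (SELECT a.id, a.name, b.detail
          FROM table_a a
          JOIN table_b b ON b.a_id = a.id
          WHERE a.id > 0 AND a.id <= 10000   -- next batch: 10000 < id <= 20000
         ) AS t;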

On Sat, Mar 9, 2024 at 9:01 PM hector vass  wrote:

>
>
> On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh 
> wrote:
>
>> Hi,
>>
>> We are inserting data close to 1M record & having a single Jsonb column
>> but query is getting stuck.
>>
>> We are using insert into select * .. , so all the operations are within
>> the DB.
>>
>> If we are running select query individually then it is returning the data
>> in 40 sec for all rows but with insert it is getting stuck.
>>
>> PG Version - 15.
>>
>> What could be the problem here ?
>>
>> Regards,
>> KD
>>
>
> insert 1M rows especially JSON that can be large, variable in size and
> stored as blobs and indexed is not perhaps the correct way to do this
> insert performance will also depend on your tuning.  Supporting
> transactions, users or bulk processing are 3x sides of a compromise.
> you should perhaps consider that insert is for inserting a few rows into
> live tables ... you might be better using copy or \copy, pg_dump if you are
> just trying to replicate a large table
>
>


Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread hassan rafi
Sure, we will plan to upgrade to the latest version.

schemaname|relname                  |n_tup_ins|n_tup_upd |n_tup_del|n_live_tup|n_dead_tup|last_vacuum|last_autovacuum              |
----------+-------------------------+---------+----------+---------+----------+----------+-----------+-----------------------------+
public    |store_seller_products    | 14865951|4584489857|472310200| 845718108|1982033749|           |2024-02-29 01:08:00.000 +0530|
public    |products_inventory_delta | 74889247|1214920129| 74703893| 253783107|  17060377|           |2024-03-09 12:53:00.000 +0530|

The value of default_statistics_target is set to 100.

The issue of high query planning time seems to intermittently resolve
itself, only to reoccur after a few hours.

On Sat, Mar 9, 2024 at 7:57 PM Robert Treat  wrote:

> It'd be worth checking that your default_statistics_target isn't set
> to anything wild, but beyond that, it'd be interesting to look at the
> output of vacuum verbose on some of the system catalogs as istm you
> might have catalog bloat.
>
> I should also mention that you're running a non-longer-supported
> version of Postgres (v11) and not even the latest release of said EOL
> version. And if I am not mistaken, "Azure Postgres single server
> version" is also deprecated, so you should really focus on getting
> upgraded to something more modern.
>
> Robert Treat
> https://xzilla.net
>
> On Sat, Mar 9, 2024 at 8:12 AM hassan rafi 
> wrote:
> >
> > Postgres version: PostgreSQL 11.18, compiled by Visual C++ build 1800,
> 64-bit
> > relname
> |relpages|reltuples|relallvisible|relkind|relnatts|relhassubclass|reloptions|pg_table_size|
> >
> -++-+-+---++--+--+-+
> > store_seller_products|16007942|843460096|   797033|r  |
> 16|false |NULL  | 131980795904|
> >
> >
> > relname
>  
> |relpages|reltuples|relallvisible|relkind|relnatts|relhassubclass|reloptions|pg_table_size|
> >
> ++-+-+---++--+--+-+
> > products_inventory_delta| 2847202|259351648|  1606201|r  |
>  4|false |NULL  |  23330758656|
> >
> > Peak load (write): 3000 TPS (mostly updates).
> > Peak load (read): 800 TPS.
> >
> >
> > On Sat, Mar 9, 2024 at 5:58 PM Ron Johnson 
> wrote:
> >>
> >> On Sat, Mar 9, 2024 at 7:18 AM hassan rafi 
> wrote:
> >>>
> >>> Hi team,
> >>>
> >>> We are seeing unusually high query planning times on our Postgres
> server. I am attaching a few query plans.
> >>
> >>
> >> Postgresql version number?
> >> Rows in the tables?
> >> System load?
>


Re: Insert with Jsonb column hangs

2024-03-09 Thread hector vass
On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh 
wrote:

> Hi,
>
> We are inserting data close to 1M record & having a single Jsonb column
> but query is getting stuck.
>
> We are using insert into select * .. , so all the operations are within
> the DB.
>
> If we are running select query individually then it is returning the data
> in 40 sec for all rows but with insert it is getting stuck.
>
> PG Version - 15.
>
> What could be the problem here ?
>
> Regards,
> KD
>

Inserting 1M rows, especially of JSON that can be large, variable in size,
stored as blobs, and indexed, is perhaps not the right way to do this.
Insert performance will also depend on your tuning: supporting transactions,
many users, and bulk processing are three sides of a compromise.
You should perhaps consider that INSERT is for inserting a few rows into
live tables ... you might be better off using COPY or \copy, or pg_dump if
you are just trying to replicate a large table.


Fwd: Getting error while upgrading

2024-03-09 Thread hector vass
On Sat, Mar 9, 2024 at 12:18 PM omkar narkar 
wrote:

> Hello Team,
>
> I am trying to upgrade my edb 10.5 community version to postgres 15.6
> version and while doing this i am getting error regarding OIDS are not
> stable across Postgresql version (sys.callback_queue_table.user_data).
> Kindly help me to get the solution of this issue.
>
> Thanks and regards,
> Omkar Narkar
>

You usually get this error if there are composite data types, or data types
that cannot be translated between 10.5 and 15.6.
The clue may be in the error message just before it says 'OIDs are not
stable across PostgreSQL versions'.
You state EDB 10.5 community; I'm guessing you are using pg_upgrade and
going from Windows to Linux? I am impressed if you can do that; would you
not end up with collation issues?
If you are using pg_upgrade, what does pg_upgrade --check say?
I would dump the schema to a SQL file (pg_dump -s >dumped.sql) and then run
the SQL one command at a time to track down where you are going to
have a problem.
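
A rough catalog query (a sketch; run it on the old cluster and adjust the
namespace filter as needed) to spot columns whose type is a composite/row
type, one common kind of type that does not translate cleanly:

    SELECT c.relname AS table_name,
           a.attname AS column_name,
           t.typname AS type_name
    FROM pg_attribute a
    JOIN pg_class c ON c.oid = a.attrelid AND c.relkind = 'r'
    JOIN pg_type  t ON t.oid = a.atttypid AND t.typtype = 'c'
    WHERE a.attnum > 0
      AND NOT a.attisdropped
      AND c.relnamespace NOT IN ('pg_catalog'::regnamespace,
                                 'information_schema'::regnamespace);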


Insert with Jsonb column hangs

2024-03-09 Thread kuldeep singh
Hi,

We are inserting close to 1M records into a table with a single JSONB column,
but the query is getting stuck.

We are using INSERT INTO ... SELECT * ..., so all the operations are within
the DB.

If we run the SELECT on its own it returns the data for all rows in about 40
seconds, but with the INSERT it gets stuck.

PG Version - 15.

What could be the problem here ?

Regards,
KD


Re: creating a subset DB efficiently ?

2024-03-09 Thread hector vass
On Fri, Mar 8, 2024 at 4:22 PM David Gauthier  wrote:

> Here's the situation
>
> - The DB contains data for several projects.
> - The tables of the DB contain data for all projects (data is not
> partitioned on project name or anything like that)
> - The "project" identifier (table column) exists in a few "parent" tables
> with many child... grandchild,... tables under them connected with foreign
> keys defined with "on delete cascade".  So if a record in one of the parent
> table records is deleted, all of its underlying, dependent records get
> deleted too.
> - New projects come in, and old ones need to be removed and "archived" in
> DBs of their own.  So there's a DB called "active_projects" and there's a
> DB called "project_a_archive" (identical metadata).
> - The idea is to copy the data for project "a" that's in "active_projects"
> to the "project_a_arhchive" DB AND delete the project a data out of
> "active_projects".
> - Leave "project_a_archive" up and running if someone needs to attach to
> that and get some old/archived data.
>
> The brute-force method I've been using is...
> 1)  pg_dump "active_projects" to a (huge) file then populate
> "project_a_archive" using that (I don't have the privs to create database,
> IT creates an empty one for me, so this is how I do it).
> 2) go into the "project_a_archive" DB and run... "delete from par_tbl_1
> where project <> 'a' ", "delete from par_tbl_2 where project <> 'a' ",
> etc... leaving only project "a" data in the DB.
> 3) go into the "active_projects" DB and "delete from par_tbl_1 where
> project = 'a' ", etc... removing project "a" from the "active_projects DB.
>
> Ya, not very elegant, it takes a long time and it takes a lot of
> resources.  So I'm looking for ideas on how to do this better.
>
> Related question...
> The "delete from par_tbl_a where project <> 'a' " is taking forever.  I
> fear it's because it's trying to journal everything in case I want to
> rollback.  But this is just in the archive DB and I don't mind taking the
> risk if I can speed this up outside of a transaction.  How can I run a
> delete command like this without the rollback recovery overhead ?
>


>(I don't have the privs to create database, IT creates an empty one for
me, so this is how I do it).

That's a shame.  You can do something similar with schemas and tablespaces:
  Template your existing schema to create a new schema for the project
(pg_dump -s).
  Create a tablespace for this new project and schema.

You can then move the physical tablespace to cheaper disk and use symbolic
links, or archive and/or back it up at the schema level with pg_dump -n.

...as long as you don't put anything in the public schema, all you are
really sharing is roles; otherwise it is a bit like a separate database.
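
A bare-bones sketch of that idea (schema, tablespace and path names here are
invented, and CREATE TABLESPACE needs superuser):

    CREATE SCHEMA project_a;
    CREATE TABLESPACE project_a_ts LOCATION '/archive_disk/project_a';
    CREATE TABLE project_a.par_tbl_1 (LIKE public.par_tbl_1 INCLUDING ALL)
        TABLESPACE project_a_ts;
    INSERT INTO project_a.par_tbl_1
    SELECT * FROM public.par_tbl_1 WHERE project = 'a';
    -- later, archive just that schema:
    --   pg_dump -n project_a -Fc mydb > project_a.dump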


Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread Robert Treat
It'd be worth checking that your default_statistics_target isn't set
to anything wild, but beyond that, it'd be interesting to look at the
output of vacuum verbose on some of the system catalogs as istm you
might have catalog bloat.
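
For instance (standard commands; pg_statistic and pg_attribute are the
catalogs I'd look at first):

    SHOW default_statistics_target;
    VACUUM (VERBOSE) pg_catalog.pg_statistic;
    VACUUM (VERBOSE) pg_catalog.pg_attribute;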

I should also mention that you're running a no-longer-supported
version of Postgres (v11) and not even the latest release of said EOL
version. And if I am not mistaken, "Azure Postgres single server
version" is also deprecated, so you should really focus on getting
upgraded to something more modern.

Robert Treat
https://xzilla.net

On Sat, Mar 9, 2024 at 8:12 AM hassan rafi  wrote:
>
> Postgres version: PostgreSQL 11.18, compiled by Visual C++ build 1800, 64-bit
> relname  
> |relpages|reltuples|relallvisible|relkind|relnatts|relhassubclass|reloptions|pg_table_size|
> -++-+-+---++--+--+-+
> store_seller_products|16007942|843460096|   797033|r  |  16|false 
> |NULL  | 131980795904|
>
>
> relname 
> |relpages|reltuples|relallvisible|relkind|relnatts|relhassubclass|reloptions|pg_table_size|
> ++-+-+---++--+--+-+
> products_inventory_delta| 2847202|259351648|  1606201|r  |   
> 4|false |NULL  |  23330758656|
>
> Peak load (write): 3000 TPS (mostly updates).
> Peak load (read): 800 TPS.
>
>
> On Sat, Mar 9, 2024 at 5:58 PM Ron Johnson  wrote:
>>
>> On Sat, Mar 9, 2024 at 7:18 AM hassan rafi  wrote:
>>>
>>> Hi team,
>>>
>>> We are seeing unusually high query planning times on our Postgres server. I 
>>> am attaching a few query plans.
>>
>>
>> Postgresql version number?
>> Rows in the tables?
>> System load?




Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread hassan rafi
Postgres version: PostgreSQL 11.18, compiled by Visual C++ build 1800,
64-bit
relname                  |relpages|reltuples|relallvisible|relkind|relnatts|relhassubclass|reloptions|pg_table_size|
-------------------------+--------+---------+-------------+-------+--------+--------------+----------+-------------+
store_seller_products    |16007942|843460096|       797033|r      |      16|false         |NULL      | 131980795904|
products_inventory_delta | 2847202|259351648|      1606201|r      |       4|false         |NULL      |  23330758656|

Peak load (write): 3000 TPS (mostly updates).
Peak load (read): 800 TPS.


On Sat, Mar 9, 2024 at 5:58 PM Ron Johnson  wrote:

> On Sat, Mar 9, 2024 at 7:18 AM hassan rafi 
> wrote:
>
>> Hi team,
>>
>> We are seeing unusually high query planning times on our Postgres server.
>> I am attaching a few query plans.
>>
>
> Postgresql version number?
> Rows in the tables?
> System load?
>


Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread Ron Johnson
On Sat, Mar 9, 2024 at 7:18 AM hassan rafi  wrote:

> Hi team,
>
> We are seeing unusually high query planning times on our Postgres server.
> I am attaching a few query plans.
>

Postgresql version number?
Rows in the tables?
System load?


Getting error while upgrading

2024-03-09 Thread omkar narkar
Hello Team,

I am trying to upgrade my EDB 10.5 community version to Postgres 15.6, and
while doing this I am getting an error saying that OIDs are not stable
across PostgreSQL versions (sys.callback_queue_table.user_data).
Kindly help me find a solution to this issue.

Thanks and regards,
Omkar Narkar


Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-09 Thread hassan rafi
Hi team,

We are seeing unusually high query planning times on our Postgres server. I
am attaching a few query plans.

select upc from store_seller_products where upc in
('0001600015840','0001600015781','0001600015777','0001600015765','0001600015764','0001600015762','0001600015483','0001600015163','0001600015128','0001600014943','0001600014733','0001600014732','0001600014711','0001600014665','0001600014599','0001600014481','0001600013977','0001600013310','0001600012929','0001600012928','0001600012685','0001600012593','0001600012541','0001600012506','0001600012499','0001600012495','0001600012479','0001600012399','0001600012254','0001600012224','000160001','0001600012185','0001600012183','0001600012125','0001600011610','0001600010810','0001600010710','0001600010640','0001600010610','0001600010430','0001600010410','0001600010371','0001595898049','0001595370752','0001595370750','0001595370713','0001590023565','000159211','000159209','0001583909712')
and store_id = '70500101' and pickup = true;

Index Only Scan using idx_store_seller_products_upc_store_id_delivery_p_tmp
on store_seller_products  (cost=0.70..99.38 rows=8 width=14) (actual
time=10.694..142.050 rows=91 loops=1)
  Index Cond: ((upc = ANY
('{0001600015840,0001600015781,0001600015777,0001600015765,0001600015764,0001600015762,0001600015483,0001600015163,0001600015128,0001600014943,0001600014733,0001600014732,0001600014711,0001600014665,0001600014599,0001600014481,0001600013977,0001600013310,0001600012929,0001600012928,0001600012685,0001600012593,0001600012541,0001600012506,0001600012499,0001600012495,0001600012479,0001600012399,0001600012254,0001600012224,000160001,0001600012185,0001600012183,0001600012125,0001600011610,0001600010810,0001600010710,0001600010640,0001600010610,0001600010430,0001600010410,0001600010371,0001595898049,0001595370752,0001595370750,0001595370713,0001590023565,000159211,000159209,0001583909712}'::text[]))
AND (store_id = '70500891'::text))
  Heap Fetches: 91
  Buffers: shared hit=314 read=184
  I/O Timings: read=129.218
Planning Time: 24797.421 ms
Execution Time: 142.131 ms
explain (analyze, verbose, buffers) SELECT products_inventory_delta.upc
FROM products_inventory_delta WHERE products_inventory_delta.modality =
'pickup' AND products_inventory_delta.store_id = '7038' ORDER BY upc
DESC LIMIT 51 OFFSET 0;


 Limit  (cost=0.57..54.37 rows=51 width=14) (actual time=27.676..27.831
rows=51 loops=1)
   Output: upc
   Buffers: shared hit=93
   ->  Index Only Scan Backward using products_inventory_delta_pkey on
public.products_inventory_delta  (cost=0.57..30625.26 rows=29030 width=14)
(actual time=27.674..27.824 rows=51 loops=1)
 Output: upc
 Index Cond: ((products_inventory_delta.store_id =
'7038'::text) AND (products_inventory_delta.modality =
'pickup'::modality))
 Heap Fetches: 50
 Buffers: shared hit=93
 Planning Time: 6142.094 ms
 Execution Time: 27.884 ms


explain (analyze, buffers) SELECT products_inventory_delta.upc FROM
products_inventory_delta WHERE products_inventory_delta.modality = 'pickup'
AND products_inventory_delta.store_id = '7038' ORDER BY upc DESC LIMIT
51 OFFSET 0;

 Limit  (cost=0.57..50.96 rows=51 width=14) (actual time=12.290..12.442
rows=51 loops=1)
   Buffers: shared hit=93
   ->  Index Only Scan Backward using products_inventory_delta_pkey on
products_inventory_delta  (cost=0.57..28164.01 rows=28502 width=14) (actual
time=12.285..12.433 rows=51 loops=1)
 Index Cond: ((store_id = '7038'::text) AND (modality =
'pickup'::modality))
 Heap Fetches: 53
 Buffers: shared hit=93
 Planning Time: 1165.382 ms
 Execution Time: 12.522 ms

Schema:

CREATE TABLE public.products_inventory_delta (
upc text NOT NULL,
store_id text NOT NULL,
modality public.modality NOT NULL,
updated_at timestamp NOT NULL,
CONSTRAINT products_inventory_delta_pkey PRIMARY KEY (store_id, modality,
upc)
);
CREATE INDEX store_modality_updates ON public.products_inventory_delta
USING btree (store_id, modality, updated_at);


CREATE TABLE public.store_seller_products (
id int8 NOT NULL GENERATED ALWAYS AS IDENTITY( INCREMENT BY 1 MINVALUE 1
MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE),
upc text NULL,
store_id text NULL,
seller_id text NULL,
delivery bool NULL,
ship bool NULL,
instore bool NULL,
pickup bool NULL,
modality_changed_at timestamp NULL,
price_changed_at timestamp NULL,
national_price_changed_at timestamp NULL,
stock_level_changed_at timestamp NULL,
created_at timestamp NULL,
updated_at timestamp NULL,
product_core_info_updated_at timestamp NULL,
regional_price_changed_at timestamp NULL,
CONSTRAINT store_seller_products_pkey PRIMARY KEY (id)
);
CREATE INDEX idx_store_seller_products_store_id_instore_upc_p_instore_tmp
ON public.store_seller_products USING btree (upc, store_id) INCLUDE
(seller_id, modality_changed_at, price_changed_at, stock_level_changed_at,
updated_at, product_core_info_updated_at) WHERE (instore = true);
CREATE