Re: Advice needed: query performance deteriorates by 2000% within 1 minute

2022-02-28 Thread Peter Adlersburg
Hello,

Michael, Tom: thanks for all the insights and information in your previous
mails.

A quick update of the explain outputs (this time using explain (analyze,
buffers, verbose))
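
For orientation, the query in question has roughly this shape (simplified;
the column list and literals are taken from the plans below, so treat it as
an approximation rather than the exact SQL):

    -- approximate shape of the query, inferred from the plans that follow
    SELECT version, content
    FROM   orderstore."order"
    WHERE  jsonb_to_tsvector('english'::regconfig, content, '["all"]'::jsonb)
           @@ '''1.20709841'''::tsquery
    LIMIT  10;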

*The good: *

*LOG Time: | 2022-02-28 09:30:01.400777+01 | order rows: |  9668*



 Limit  (cost=616.37..653.30 rows=10 width=22) (actual time=1.062..1.063 rows=1 loops=1)
   Output: version, content
   Buffers: shared hit=154
   ->  Bitmap Heap Scan on orderstore."order"  (cost=616.37..793.63 rows=48 width=22) (actual time=1.061..1.062 rows=1 loops=1)
         Output: version, content
         Recheck Cond: (jsonb_to_tsvector('english'::regconfig, "order".content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
         Heap Blocks: exact=1
         Buffers: shared hit=154
         ->  Bitmap Index Scan on idx_fulltext_content  (cost=0.00..616.36 rows=48 width=0) (actual time=1.053..1.053 rows=1 loops=1)
               Index Cond: (jsonb_to_tsvector('english'::regconfig, "order".content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
               Buffers: shared hit=153
 Planning:
   Buffers: shared hit=50
 Planning Time: 0.408 ms
*Execution Time: 1.079 ms*
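
(The bitmap scans above use idx_fulltext_content; its definition is not
shown in this mail, but judging from the index condition it is a GIN
expression index along these lines:)

    -- assumed shape of the index, inferred from the Index Cond expression
    CREATE INDEX idx_fulltext_content
        ON orderstore."order"
     USING gin (jsonb_to_tsvector('english'::regconfig, content, '["all"]'::jsonb));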

*pg_stat_all_tables: *

n_tup_ins           | 102
*n_tup_upd          | 1554*
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 9668
*n_dead_tup          | 69*
n_mod_since_analyze | 61
n_ins_since_vacuum  | 8
last_vacuum         | 2022-02-25 07:54:46.196508+01
last_autovacuum     |
last_analyze        | 2022-02-28 03:20:38.761482+01
last_autoanalyze    |


*The bad: *


*LOG Time: | 2022-02-28 09:45:01.662702+01 | order rows: |  9668*



LIMIT 10:



 Limit  (cost=0.00..805.63 rows=10 width=22) (actual time=24175.964..25829.767 rows=1 loops=1)
   Output: version, content
   Buffers: shared hit=26284 read=12550 dirtied=4
   ->  Seq Scan on orderstore."order"  (cost=0.00..3867.01 rows=48 width=22) (actual time=24175.962..25829.763 rows=1 loops=1)
         Output: version, content
         Filter: (jsonb_to_tsvector('english'::regconfig, "order".content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
         Rows Removed by Filter: 9667
         Buffers: shared hit=26284 read=12550 dirtied=4
 Planning:
   Buffers: shared hit=50
 Planning Time: 0.377 ms

*Execution Time: 25829.778 ms*

*pg_stat_all_tables:*

n_tup_ins           | 102
*n_tup_upd          | 1585*
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 9668
*n_dead_tup          | 100*
n_mod_since_analyze | 92
n_ins_since_vacuum  | 8
last_vacuum         | 2022-02-25 07:54:46.196508+01
last_autovacuum     |
last_analyze        | 2022-02-28 03:20:38.761482+01
last_autoanalyze    |


*The ugly:*


It should be mentioned that the table in question mainly lives in TOAST
land (though I have no idea whether this also influences the query planner):


  oid   | table_schema | table_name | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total  | index  | toast | table
--------+--------------+------------+--------------+-------------+-------------+-------------+-------------+--------+--------+-------+-------
 155544 | orderstore   | order      |         9570 |   229826560 |   120184832 |    98557952 |    11083776 | 219 MB | 115 MB | 94 MB | 11 MB
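
(For the record, a size breakdown like the one above can be produced with
something along these lines - a generic sketch, not necessarily the exact
query used here:)

    -- generic sketch for a per-table size breakdown (heap / indexes / toast)
    SELECT c.oid,
           n.nspname                                     AS table_schema,
           c.relname                                     AS table_name,
           c.reltuples::bigint                           AS row_estimate,
           pg_total_relation_size(c.oid)                 AS total_bytes,
           pg_indexes_size(c.oid)                        AS index_bytes,
           coalesce(pg_total_relation_size(c.reltoastrelid), 0) AS toast_bytes,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total
    FROM   pg_class c
    JOIN   pg_namespace n ON n.oid = c.relnamespace
    WHERE  n.nspname = 'orderstore'
    AND    c.relname = 'order';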


Since tinkering with the text search functions is out of the question, we
came up with three possibilities for dealing with this issue:

- significantly increase the LIMIT or omit it altogether (meh ...)
- use 'set random_page_cost = 0.5' inside the transaction to convince the
query planner to prefer the index (tested and works; sketched below)
- schedule an hourly vacuum job for the table (the most likely solution we
will settle on, since it comes with the least implementation effort)

None of these seems very elegant or viable in the long run ... we'll see.
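
For completeness, the second option as we tested it is essentially this
(shown here with SET LOCAL so the setting only lives for the transaction;
the exact wrapping in our application code may differ):

    -- option two: lower random_page_cost for this transaction only,
    -- so the planner picks the GIN index instead of the seq scan
    BEGIN;
    SET LOCAL random_page_cost = 0.5;

    SELECT version, content
    FROM   orderstore."order"
    WHERE  jsonb_to_tsvector('english'::regconfig, content, '["all"]'::jsonb)
           @@ '''1.20709841'''::tsquery
    LIMIT  10;

    COMMIT;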

Ah, yes: our global settings for random_page_cost and autovacuum/analyze
are set to the defaults.
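
(A possible alternative to the hourly cron vacuum would be per-table
autovacuum settings, so that the small absolute number of updates on this
~9,700-row table triggers autovacuum/autoanalyze before the plan flips;
with the default scale factors of 20% / 10% that practically never happens
here. The thresholds below are illustrative, not something we have tested:)

    -- trigger autovacuum/autoanalyze after a fixed number of changed rows
    -- instead of the default percentage-based thresholds
    ALTER TABLE orderstore."order" SET (
        autovacuum_vacuum_scale_factor  = 0.0,
        autovacuum_vacuum_threshold     = 100,
        autovacuum_analyze_scale_factor = 0.0,
        autovacuum_analyze_threshold    = 100
    );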

Will JSON processing see some improvements in PG 14/15? We are about to
upgrade to 14 in the near future; our devs say this topic is the main
trigger for doing so.

Any further thoughts on the case are very much appreciated.

kr p.


On Thu, 24 Feb 2022 at 17:10, Tom Lane wrote:

> Peter Adlersburg  writes:
> >  Limit  (cost=0.00..804.97 rows=10 width=22) (actual time=23970.845..25588.432 rows=1 loops=1)
> >    ->  Seq Scan on "order"  (cost=0.00..3863.86 rows=48 width=22) (actual time=23970.843..25588.429 rows=1 loops=1)
> >          Filter: (jsonb_to_tsvector('english'::regconfig, content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
> >          Rows Removed by Filter: 9652
> >  Planning Time: 0.430 ms
> >  Execution Time: 25588.448 ms
>
> I think the expense here comes from re-executing jsonb_to_tsvector
> a lot of times.  By default that's estimated as 100 times more expensive
> than a simple function (such as addition), but these results make it
> seem like that's an understatement.  You 
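
(If Tom's point about the per-call cost estimate holds, one related knob
would be to raise the planner's cost estimate for the function itself; a
minimal sketch, untested here, with an illustrative value rather than a
recommendation from this thread:)

    -- tell the planner jsonb_to_tsvector is much more expensive than the
    -- default COST 100, making repeated execution in a seq scan look worse
    ALTER FUNCTION jsonb_to_tsvector(regconfig, jsonb, jsonb) COST 1000;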

Re: Never Ending query in PostgreSQL

2022-02-28 Thread Mladen Gogala

On 2/27/22 12:20, Jeff Janes wrote:

Several hours is not all that long.


Well, the pyramids in the Valley of the Kings have lasted around 4,500
years. Dinosaurs ruled the Earth for approximately 120 million years. The
solar system is 5 billion years old. The cosmos is around 13 billion years
old. Compared to those numbers, indeed, several hours isn't that long.
Furthermore, you have to account for time dilation. One hour on a planet
that's evolving and revolving at 900 miles an hour is not the same as one
hour of standing still. To make things even more interesting, it's orbiting
at 19 miles a second, so it's reckoned, the sun that is the source of all
our power. So, several hours is relative. Each object has its own relative
time, so it's not possible to conclude whether several hours is a long time
or not.


--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com


Re: Never Ending query in PostgreSQL

2022-02-28 Thread Mladen Gogala

On 2/26/22 23:40, Kumar, Mukesh wrote:

Hi Team,

Can you please help with tuning the attached query? I am trying to run it,
and it runs for several hours without giving any output.


I am also not able to generate the EXPLAIN ANALYZE plan; it likewise keeps
running for several hours and does not give output.


I have attached the query and the explain plan (without ANALYZE). Please
help if anyone has any idea how to tune this query.


Regards,
Mukesh Kumar



Hi Team Member,

Your attachments are not SQL, they are plans. Judging by the size of the
plans, your best course of action is to completely rewrite the queries,
probably using CTEs and temporary tables. May the Force be with you.
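
To illustrate the general idea (table and column names here are purely
hypothetical, not taken from your attachments): materialize one expensive
intermediate result into a temporary table, ANALYZE it so the planner has
real row counts, and then join against it instead of asking the planner to
handle one enormous query in a single pass.

    -- hypothetical sketch of the "temporary table" approach
    CREATE TEMP TABLE recent_orders AS
    SELECT o.order_id, o.customer_id, o.total_amount
    FROM   orders o
    WHERE  o.created_at >= now() - interval '30 days';

    ANALYZE recent_orders;   -- give the planner accurate statistics

    SELECT c.customer_name, sum(r.total_amount) AS total
    FROM   recent_orders r
    JOIN   customers c USING (customer_id)
    GROUP  BY c.customer_name;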


Regards

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com


Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-28 Thread Justin Pryzby
On Mon, Feb 28, 2022 at 09:43:09PM +, l...@laurent-hasson.com wrote:
> > On Wed, Feb 23, 2022 at 07:04:15PM -0600, Justin Pryzby wrote:
> > > And the aforementioned network trace.  You could set a capture filter on TCP
> > > SYN|RST so it's not absurdly large.  From my notes, it might look like this:
> > > (tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)
> >
> > I'd also add '|| icmp'.  My hunch is that you'll see some ICMP (not "ping")
> > being sent by an intermediate gateway, resulting in the connection being reset.
> 
> I am so sorry but I do not understand what you are asking me to do. I am 
> unfamiliar with these commands. Is this a postgres configuration file? Is 
> this something I just do once or something I leave on to hopefully catch it 
> when the issue occurs? Is this something to do on the DB machine or the ETL 
> machine? FYI:

It's no problem.

I suggest that you run wireshark with a capture filter to try to show *why* the
connections are failing.  I think the capture filter might look like:

(icmp || (tcp[tcpflags] & (tcp-rst|tcp-syn|tcp-fin)!=0)) && host 10.64.17.211

With the "host" filtering for the IP address of the *remote* machine.

You could run that on whichever machine is more convenient and leave it running
for however long it takes for that error to happen.  You'll be able to save a
.pcap file for inspection.  I suppose it'll show either a TCP RST or an ICMP.
Whichever side sent that is where the problem is.  I still suspect the issue
isn't in postgres.

>   - My ETL machine is on 10.64.17.211
>   - My DB machine is on 10.64.17.210
>   - Both on Windows Server 2012 R2, x64

These network details make my theory unlikely.

They're on the same subnet with no intermediate gateways, and communicate
directly via a hub/switch/crossover cable.  If that's true, then both will have
each other's hardware address in ARP after pinging from one to the other.

-- 
Justin




RE: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-28 Thread l...@laurent-hasson.com


   >  -Original Message-
   >  From: Justin Pryzby 
   >  Sent: Friday, February 25, 2022 08:02
   >  To: l...@laurent-hasson.com
   >  Cc: pgsql-performa...@postgresql.org
   >  Subject: Re: An I/O error occurred while sending to the backend (PG 13.4)
   >  
   >  On Wed, Feb 23, 2022 at 07:04:15PM -0600, Justin Pryzby wrote:
   >  > And the aforementioned network trace.  You could set a capture filter
   >  > on TCP SYN|RST so it's not absurdly large.  From my notes, it might
   >  > look like this:
   >  > (tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)
   >  
   >  I'd also add '|| icmp'.  My hunch is that you'll see some ICMP (not
   >  "ping") being sent by an intermediate gateway, resulting in the
   >  connection being reset.
   >  
   >  --
   >  Justin


Hello Justin,

I am so sorry but I do not understand what you are asking me to do. I am 
unfamiliar with these commands. Is this a postgres configuration file? Is this 
something I just do once or something I leave on to hopefully catch it when the 
issue occurs? Is this something to do on the DB machine or the ETL machine? FYI:

  - My ETL machine is on 10.64.17.211
  - My DB machine is on 10.64.17.210
  - Both on Windows Server 2012 R2, x64

So sorry for the bother,
Laurent.






RE: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-28 Thread l...@laurent-hasson.com


>From: Ranier Vilela  
>Sent: Thursday, February 24, 2022 08:46
>To: Justin Pryzby 
>Cc: l...@laurent-hasson.com; pgsql-performa...@postgresql.org
>Subject: Re: An I/O error occurred while sending to the backend (PG 13.4) 
>
>On Thu, Feb 24, 2022 at 09:59, Justin Pryzby wrote:
>On Thu, Feb 24, 2022 at 08:50:45AM -0300, Ranier Vilela wrote:
>> I can't understand why you are still using 13.4?
>> [1] There is a long discussion about the issue with 13.4, the project was
>> made to fix a DLL bottleneck.
>> 
>> Why you not use 13.6?
>
>That other problem (and its fix) were in the Windows build environment, and not
>an issue in some Postgres version.
>Yeah, correct.
>But I think it was very clear in the other thread that version 13.4,
>on Windows, may have a slowdown because of the DLL problem.
>So it would be better to use the latest available version
>that has this specific fix and many others.
>
>regards,
>Ranier Vilela


OK, absolutely. I was even thinking about moving to 14. I know migrations 
within a release are painless, but my experience with upgrading across releases 
has also been quite good (aside from bugs that were found, of course). Any 
opinion on 14.2?

Thank you, Laurent.