Re: speeding up grafana sensor-data query on raspberry pi 3

2023-04-16 Thread Justin Pryzby
On Sun, Apr 16, 2023 at 07:00:33PM +0200, Clemens Eisserer wrote:
> Hi,
> 
> I am currently trying to migrate an influxdb 1.7 smarthome database to
> postgresql (13.9) running on my raspberry 3.
> It works quite well, but for the queries executed by grafana I get a
> bit higher execution times than I'd hoped for.

Suggestions:

 - enable track_io_timing and show explain (analyze,buffers,settings)
 - or otherwise show your non-default settings;
 - show \d of your table(s)
 - show the query plan for the 6-month query.  The query plan may be
   different, or (if you can run it with "analyze") it may be
   illuminating to see how the query "scales".
 - consider trying postgres 15 (btw, v16 will have a beta release next
   month)
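For example, a minimal sketch of the first two suggestions (the table, columns
and interval are placeholders for whatever grafana actually runs):

  SET track_io_timing = on;
  EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
    SELECT time, value FROM sensor_data
    WHERE time > now() - interval '6 months';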

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-- 
Justin




Re: multicolumn partitioning help

2023-03-14 Thread Justin Pryzby
On Sun, Mar 12, 2023 at 01:59:32PM -0400, James Robertson wrote:
> Hey folks,
> I am having issues with multicolumn partitioning. For reference I am using
> the following link as my guide:
> https://www.postgresql.org/docs/devel/sql-createtable.html
> 
> Reading the documentation:   "When creating a range partition, the lower
> bound specified with FROM is an inclusive bound, whereas the upper bound
> specified with TO is an exclusive bound".
> 
> However I can't insert any of the following after the first one, because it
> says it overlaps. Do I need to do anything different when defining
> multi-column partitions?

The bounds are compared like rows:

When creating a range partition, the lower bound specified with FROM is
an inclusive bound, whereas the upper bound specified with TO is an
exclusive bound. That is, the values specified in the FROM list are
valid values of the corresponding partition key columns for this
partition, whereas those in the TO list are not. Note that this
statement must be understood according to the rules of row-wise
comparison (Section 9.24.5). For example, given PARTITION BY RANGE
(x,y), a partition bound FROM (1, 2) TO (3, 4) allows x=1 with any y>=2,
x=2 with any non-null y, and x=3 with any y<4.

https://www.postgresql.org/docs/current/functions-comparisons.html#ROW-WISE-COMPARISON
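You can see the row-wise comparison directly, using the bounds from your
partitions below:

  SELECT (1968, '1') >= (1968, '0') AS ge_lower,  -- true
         (1968, '1') <  (1969, '1') AS lt_upper;  -- true, so (1968,'1') falls inside FROM (1968,'0') TO (1969,'1')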

> This works:
> CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968, '0')
> TO (1969, '1');

This table covers everything from (1968, '0') up to, but not including,
(1969, '1'): all of 1968, plus any 1969 rows whose second column sorts below '1'.

> These fail:
> CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968, '1')
> TO (1969, '2');

Which is why these are overlapping.

> CREATE TABLE humans_1969_1 PARTITION OF humans FOR VALUES FROM (1969, '1')
> TO (1970, '2');

This one doesn't fail, but it "occupies" all of 1969 from '1' onward (up to,
but not including, (1970, '2')).
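If the intent is one partition per (year, category) pair, bounds like these
don't overlap (a sketch; the second column's values are guessed from your
partition names):

  CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968, '0') TO (1968, '1');
  CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968, '1') TO (1968, '2');
  CREATE TABLE humans_1969_0 PARTITION OF humans FOR VALUES FROM (1969, '0') TO (1969, '1');
  CREATE TABLE humans_1969_1 PARTITION OF humans FOR VALUES FROM (1969, '1') TO (1969, '2');

Each upper bound stays within the same year, so no partition swallows the
following year's rows.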

-- 
Justin




Re: BRIN index worse than sequential scan for large search set

2023-02-24 Thread Justin Pryzby
On Fri, Feb 24, 2023 at 06:51:00PM +0100, Mickael van der Beek wrote:
> Hello Justin,
> 
> Thanks for the quick response!
> 
> > The table may be dense, but the tuples aren't.  You're asking to return
> > 1/1000th of the tuples, across the entire table.  Suppose there are ~100
> > tuples per page, and you need to read about every 10th page.  It makes
> > sense that it's slow to read a large amount of data nonsequentially.
> 
> Ah, of course, you're right!
> I forgot that the BRIN indexes store ranges that are not fully covered by
> the row values and that PostgreSQL has to double-check (bitmap heap scan)
> ...
> Would you thus advise to only use BRIN indexes for columns whose values are
> (1) monotonically increasing but also (2) close to each other?

It's not important whether they're "rigidly" monotonic (nor "strictly").
What's important is that a query doesn't need to access a large number
of pages.

For example, some of the BRIN indexes that I'm familiar with are created
on a column called "start time", but the table's data tends to be
naturally sorted by "end time" - and that's good enough.  If someone
queries for data between 12pm and 1pm, there's surely no data for the
first 12 hours of the day's table (because it hadn't happened yet) and
there's probably no data for the last 9+ hours of the day, either, so
it's only got to read data for a 1-2h interval in the middle.  This
assumes that the column's data is typically correlated.  If the tuples
aren't clustered/"close to each other" then it probably doesn't work
well.  I haven't played with brin "multi minmax", though.
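One hedged way to check whether a column is a good BRIN candidate is its
physical correlation (table/column names are placeholders):

  SELECT tablename, attname, correlation
    FROM pg_stats
    WHERE tablename = 'mytable' AND attname = 'start_time';

Values near +1 or -1 mean the rows are laid out roughly in that column's
order, which is what lets a BRIN scan skip most of the table.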

> > >2. Since we only select the "idx" column, why does the BRIN index not
> > >simply return the searched value if included in one of it's ranges?
> > >Hitting the actual row data stored in the table seems to be unnecessary no?
> >
> > Because it's necessary to check if the tuple is visible to the current
> > transaction.  It might be from an uncommitted/aborted transaction.

Actually, a better explanation is that all the brin scan returns is the page,
and not the tuples.

"BRIN indexes can satisfy queries via regular bitmap index scans, and
will return all tuples in all pages within each range if the summary
info stored by the index is CONSISTENT with the query conditions.  The
query executor is in charge of rechecking these tuples and discarding
those that do not match the query conditions — in other words, these
indexes are LOSSY".

The index is returning pages where matching tuples *might* be found,
after excluding those pages where it's certain that no tuples are found.

-- 
Justin




Re: BRIN index worse than sequential scan for large search set

2023-02-24 Thread Justin Pryzby
On Fri, Feb 24, 2023 at 05:40:55PM +0100, Mickael van der Beek wrote:
> Hello everyone,
> 
> I'm playing around with BRIN indexes so as to get a feel for the feature.
> During my tests, I was unable to make BRIN indexes perform better than a
> sequential scan for queries searching for large value sets (20K values in
> the example down below).

> And now let's query 20K random rows from our 20M total rows:

I didn't try your test, but I think *random* is the problem/explanation.

> By default, this query will not use the BRIN index and simply run a 1.5s
> long sequential scan (hitting 700 MB) and a 2.47s hash join for a total
> 8.7s query time:
> https://explain.dalibo.com/plan/46c3191g8a6c1bc7

> If we force the use of the BRIN index using (`SET LOCAL enable_seqscan =
> OFF;`) the same query will now take 50s with 2.5s spent on the bitmap index
> scan (hitting 470 MB of data) and a whopping 42s on the bitmap heap scan
> (hitting 20 GB of data!):
> https://explain.dalibo.com/plan/7f73bg9172a8b226

That means the planner's cost model correctly preferred a seq scan.

> So I had the following two questions:
> 
>1. Why is the BRIN index systematically worse than a sequential scan,
>even when the table is x1000 larger than the search set, physically
>pre-sorted, dense (fillfactor at 100%) and the search rows are themselves
>sorted?

The table may be dense, but the tuples aren't.  You're asking to return
1/1000th of the tuples, across the entire table.  Suppose there are ~100
tuples per page, and you need to read about every 10th page.  It makes
sense that it's slow to read a large amount of data nonsequentially.
That's why random_page_cost is several times higher than seq_page_cost.

I would expect brin to win if the pages to be accessed were dense rather
than distributed across the whole table.
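You can check the actual tuples-per-page figure for your table like this
(a sketch; 'mytable' is a placeholder):

  SELECT relpages, reltuples::bigint,
         (reltuples / NULLIF(relpages, 0))::int AS tuples_per_page
    FROM pg_class WHERE relname = 'mytable';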

>2. Since we only select the "idx" column, why does the BRIN index not
>simply return the searched value if included in one of it's ranges?
>Hitting the actual row data stored in the table seems to be unnecessary no?

Because it's necessary to check if the tuple is visible to the current
transaction.  It might be from an uncommitted/aborted transaction.

-- 
Justin




Re: Getting an index scan to be a parallel index scan

2023-02-01 Thread Justin Pryzby
On Wed, Feb 01, 2023 at 11:22:47AM -0800, Alex Kaiser wrote:
> I've never messed around with extended statistics, but I'm not sure how
> they would help here. From what I've read they seem to help when your query
> is restricting over multiple columns. Since this query is only on one
> column I'm not sure what a good "CREATE STATISTICS ..." command to run
> would be to improve the query plan. Any suggestions?

They wouldn't help.  It seems like that was a guess.

> As for how I found 'force_parallel_mode', I think I found it first here:
> https://postgrespro.com/list/thread-id/2574997 and then I also saw it when
> I was searching for 'parallel' on https://postgresqlco.nf .

Yeah.  force_parallel_mode is meant for debugging only, and we're
wondering how people end up trying to use it for other purposes.

http://rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html

Did you try adjusting min_parallel_index_scan_size /
min_parallel_table_scan_size ?
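For example, a sketch (the values and the query are only placeholders to
experiment with):

  SET min_parallel_index_scan_size = '128kB';  -- default 512kB
  SET min_parallel_table_scan_size = '1MB';    -- default 8MB
  EXPLAIN SELECT * FROM testing WHERE id = ANY('{1,2,3}'::int[]);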

-- 
Justin




Re: Database Stalls

2023-01-30 Thread Justin Pryzby
On Mon, Jan 30, 2023 at 05:47:49PM +, Mok wrote:
> Hi,
> 
> We've started to observe instances of one of our databases stalling for a
> few seconds.
> 
> We see a spike in wal write locks then nothing for a few seconds. After
> which we have spike latency as processes waiting to get to the db can do
> so.
> 
> There is nothing in the postgres logs that give us any clues to what could
> be happening, no locks, unusually high/long running transactions, just a
> pause and resume.
> 
> Could anyone give me any advice as to what to look for when it comes to
> checking the underlying disk that the db is on?

What version postgres?  What settings have non-default values ? 


What OS/version?  What environment/hardware?  VM/image/provider/... 



Have you enabled logging for vacuum/checkpoints/locks ?

https://wiki.postgresql.org/wiki/Slow_Query_Questions
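The non-default settings can be listed with the query from that wiki page,
roughly:

  SELECT name, current_setting(name), source
    FROM pg_settings
    WHERE source NOT IN ('default', 'override');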




Re: Advice on best way to store a large amount of data in postgresql

2023-01-09 Thread Justin Pryzby
On Sun, Jan 08, 2023 at 07:02:01AM -0500, spiral wrote:
> This table is used essentially as a key-value store; rows are accessed
> only with `mid` primary key. Additionally, inserted rows may only be
> deleted, but never updated.
> 
> We only run the following queries:
> - INSERT INTO messages VALUES (...data...);
> - SELECT * FROM messages WHERE mid = $1;
> - DELETE FROM messages WHERE mid = $1;
> - DELETE FROM messages WHERE mid IN ($1...$n);
> - SELECT count(*) FROM messages;

Great - it's good to start with the queries to optimize.

Are you using the extended query protocol with "bind" parameters, or are they
escaped and substituted by the client library ?

> So, the problem: I don't know how to best store this data in
> postgres, or what system requirements would be needed.
> Originally, this table did not contain a substantial amount of data,
> and so I stored it in the same database as our CRUD user data. However,
> as the table became larger, cache was being allocated to (mostly
> unused) historical data from the `messages` table, and I decided to
> move the large table to its own postgres instance.
> 
> At the same time, I partitioned the table, with TimescaleDB's automatic
> time-series partitioning, because our data is essentially time-series
> (`mid` values are Twitter-style snowflakes) and it was said that
> partitioning would improve performance.
> This ended up being a mistake... shared_buffers memory usage went way
> up, from the 20GB of the previous combined database to 28GB for just
> the messages database, and trying to lower shared_buffers at all made
> the database start throwing "out of shared memory" errors when running
> DELETE queries. A TimescaleDB update did improve this, but 28GB is way
> more memory than I can afford to allocate to this database - instead of
> "out of shared memory", it gets OOM killed by the system.

Can you avoid using DELETE and instead use DROP ?  I mean, can you
arrange your partitioning such that the things to be dropped are all in
one partition, to handle in bulk ?  That's one of the main reasons for
using partitioning.

(Or, as a worse option, if you need to use DELETE, can you change the
query to DELETE one MID at a time, and loop over MIDs?)
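A sketch of the difference, assuming time-based partitions with hypothetical
names (with TimescaleDB, its drop_chunks() function plays the role of dropping
a partition):

  -- per-row deletes have to find, lock and clean up each tuple:
  DELETE FROM messages WHERE mid < 123456789000000;
  -- dropping a whole partition is a quick, metadata-only operation:
  DROP TABLE messages_2022_11;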

What version of postgres is it ?  Ah, I found that you reported the same thing
in at least one other place.  (It'd be useful to include that information here,
as well as the prior discussion with the other product/vendor.)

https://github.com/timescale/timescaledb/issues/5075

In this other issue report, you said that you increased
max_locks_per_transaction.  I suppose you need to increase it further,
or decrease your chunk size.  How many "partitions" do you have
(actually, timescale uses inheritance) ?

-- 
Justin




Re: How to analyze of short but heavy intermittent slowdown on BIND on production database (or BIND vs log_lock_waits)

2022-12-31 Thread Justin Pryzby
On Sat, Dec 31, 2022 at 02:26:08PM +0200, Maxim Boguk wrote:
> Hi,
> 
> When performing post-mortem analysis of some short latency spikes on a
> heavily loaded database, I found that the reason for (less than 10 second
> latency spike) wasn't on the EXECUTE stage but on the BIND stage.
> At the same time graphical monitoring shows that during this few second
> period there were some queries waiting in the BIND stage.
> 
> Logging setup:
> log_min_duration_statement=200ms
> log_lock_waits=on
> deadlock_timeout=100ms
> So I expected that every lock waiting over 100ms (>deadlock_timeout) should
> be in the log.
> But in the log I see only spikes on slow BIND but not lock waits logged.

What version postgres?  What settings have non-default values ?
What OS/version?  What environment/hardware?  VM/image/provider/...
What are the queries that are running BIND ?  What parameter types ?
Are the slow BINDs failing?  Are their parameters being logged ?
What else is running besides postgres ?  Are the DB clients local or
remote ?  It shouldn't matter, but what client library?

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-- 
Justin




Re: When you really want to force a certain join type?

2022-12-28 Thread Justin Pryzby
On Wed, Dec 28, 2022 at 10:39:14AM -0500, Gunther Schadow wrote:
> I have a complex query which essentially runs a finite state automaton
> through a with recursive union, adding the next state based on the
> previous.  This is run at 100,000 or a million start states at the same
> time, picking a new record (token), matching it to the FSA (a three-way
> join:

> There are 100s of thousands of states. This join has a HUGE fan out if it is

> I doubt that I can find any trick to give to the planner better data which
> it can then use to figure out that the merge join is a bad proposition.

> Note, for my immediate relief I have forced it by simply set
> enable_mergejoin=off. This works fine, except, it converts both into a
> nested loop, but the upper merge join was not a problem, and sometimes (most
> often) nested loop is a bad choice for bulk data. It's only for this
> recursive query it sometimes makes sense.

Maybe the new parameter in v15 would help.

https://www.postgresql.org/docs/15/runtime-config-query.html#GUC-RECURSIVE-WORKTABLE-FACTOR
recursive_worktable_factor (floating point)

Sets the planner's estimate of the average size of the working table
of a recursive query, as a multiple of the estimated size of the
initial non-recursive term of the query. This helps the planner
choose the most appropriate method for joining the working table to
the query's other tables. The default value is 10.0. A smaller value
such as 1.0 can be helpful when the recursion has low “fan-out” from
one step to the next, as for example in shortest-path queries. Graph
analytics queries may benefit from larger-than-default values.
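A minimal sketch of trying it (v15 only; the factor and the toy query are
placeholders, yours would be the recursive FSA query):

  SET recursive_worktable_factor = 100.0;  -- default 10.0; larger values tell the planner to expect high fan-out
  EXPLAIN (ANALYZE, BUFFERS)
  WITH RECURSIVE t AS (
      SELECT 1 AS state
    UNION ALL
      SELECT state + 1 FROM t WHERE state < 10
  )
  SELECT count(*) FROM t;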

-- 
Justin




Re: Fwd: temp_file_limit?

2022-12-20 Thread Justin Pryzby
On Mon, Dec 19, 2022 at 09:10:27PM +0100, Frits Jalvingh wrote:
> @justin
> 
> I tried the create statistics variant and that definitely improves the
> estimate, and with that one of the "bad" cases (the one with the 82 minute
> plan) now creates a good plan using only a few seconds.
> That is a worthwhile path to follow. A bit hard to do, because those
> conditions can be anything, but I can probably calculate the ones used per
> customer and create those correlation statistics from that... It is
> definitely better than tweaking the "poor man's query hints" enable_
> 8-/ which is really not helping with plan stability either.
> 
> That will be a lot of work, but I'll let you know the results ;)

Yeah, if the conditions are arbitrary, then it's going to be more
difficult.  Hopefully you don't have too many columns.  :)

I suggest enabling autoexplain and monitoring for queries which were
slow, and retroactively adding statistics to those columns which are
most-commonly queried, and which have correlations (which the planner
doesn't otherwise know about).

You won't want to have more than a handful of columns in a stats object
(since it requires factorial(N) complexity), but you can have multiple
stats objects with different combinations of columns (and, in v14,
expressions).  You can also set a lower stats target to make the cost a
bit lower.
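For instance, a sketch (the columns, table and target are hypothetical):

  CREATE STATISTICS s_cols (dependencies) ON col1, col2, col3 FROM mytable;
  ALTER STATISTICS s_cols SET STATISTICS 100;  -- lower target makes ANALYZE cheaper (v13+)
  ANALYZE mytable;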

You could try to check which columns are correlated, either by running:
| SELECT COUNT(1),col1,col2 FROM tbl GROUP BY 2,3 ORDER BY 1;
for different combinations of columns.

Or by creating a tentative/experimental stats object on a handful of
columns at a time for which you have an intuition about their
correlation, and then checking the calculated dependencies FROM
pg_stats_ext.  You may need to do something clever to use that for
arbitrary columns.  Maybe this is a start.
| SELECT dep.value::float, tablename, attnames, dep.key, exprs
|   FROM (SELECT (json_each_text(dependencies::text::json)).* AS dep, * FROM pg_stats_ext) dep
|   WHERE dependencies IS NOT NULL
|   ORDER BY 1 DESC;  -- AND regexp_count(key, ',') < 2

-- 
Justin




Re: Fwd: temp_file_limit?

2022-12-19 Thread Justin Pryzby
On Sun, Dec 18, 2022 at 06:29:41PM +0100, Frits Jalvingh wrote:
> By itself I'm used to bad query performance in Postgresql; our application
> only does bulk queries and Postgres quite often makes terrible plans for
> those, but with set enable_nestloop=false set always most of them at least
> execute. The remaining failing queries are almost 100% caused by bad join
> sequences; I plan to work around those by forcing the join order from our
> application. For instance, the exact same query above can also generate the
> following plan (this one was created by manually setting
> join_collapse_limit = 1, but fast variants also occur quite often when
> disabling parallelism):

I, too, ended up setting enable_nestloop=false for our report queries,
to avoid the worst-case plans.

But you should also try to address the rowcount misestimates.  This
underestimates the rowcount by a factor of 69 (or 138 in the plan you
sent today):

| (soort = 'MSL'::text) AND (code = 'DAE'::text)

If those conditions are correlated, you can improve the estimate by
adding extended stats object.

| CREATE STATISTICS s_h_sturingslabel_ssm_stats ON soort, code FROM s_h_sturingslabel_ssm;
| ANALYZE s_h_sturingslabel_ssm;

Unfortunately, stats objects currently only improve scans, and not
joins, so that might *improve* some queries, but it won't resolve the
worst problems:

| Hash Join (cost=22,832.23..44,190.21 rows=185 width=47) (actual time=159.725..2,645,634.918 rows=28,086,472,886 loops=1)

Maybe you can improve that by adjusting the stats target or ndistinct...
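A sketch of both knobs (the values are only examples to experiment with):

  ALTER TABLE s_h_sturingslabel_ssm ALTER COLUMN soort SET STATISTICS 1000;
  ALTER TABLE s_h_sturingslabel_ssm ALTER COLUMN code SET (n_distinct = 100);
  ANALYZE s_h_sturingslabel_ssm;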

-- 
Justin




Re: Fwd: temp_file_limit?

2022-12-19 Thread Justin Pryzby
On Mon, Dec 19, 2022 at 06:27:57PM +0100, Frits Jalvingh wrote:
> I have listed the files during that run,

> 213M -rw--- 1 postgres postgres 213M dec 19 17:46 i100of128.p0.0
> 207M -rw--- 1 postgres postgres 207M dec 19 17:46 i100of128.p1.0
> 210M -rw--- 1 postgres postgres 210M dec 19 17:49 i100of256.p0.0
> 211M -rw--- 1 postgres postgres 211M dec 19 17:49 i100of256.p1.0
> 188M -rw--- 1 postgres postgres 188M dec 19 17:53 i100of512.p0.0
[...]

I think that proves Thomas' theory.  I'm not sure how that helps you,
though...

On Mon, Dec 19, 2022 at 01:51:33PM +1300, Thomas Munro wrote:
> One possibility is that you've hit a case that needs several rounds of
> repartitioning (because of a failure to estimate the number of tuples
> well), but we can't see that because you didn't show EXPLAIN (ANALYZE)
> output (understandably if it runs out of disk space before
> completing...).  The parallel hash code doesn't free up the previous
> generations' temporary files; it really only needs two generations'
> worth concurrently (the one it's reading from and the one it's writing
> to).  In rare cases where more generations are needed it could unlink
> the older ones -- that hasn't been implemented yet.  If you set
> log_temp_files = 0 to log temporary file names, it should be clear if
> it's going through multiple rounds of repartitioning, from the names
> (...of32..., ...of64..., ...of128..., ...of256..., ...).

-- 
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581




Re: Fwd: temp_file_limit?

2022-12-19 Thread Justin Pryzby
On Mon, Dec 19, 2022 at 05:57:42PM +0100, Frits Jalvingh wrote:
> @justin:
> 
> Ran the query again. Top shows the following processes:
>    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+

Thanks

> root@chatelet:/d2/var/lib/postgresql/15/main/base# du --max=2 -mx
> ./pgsql_tmp |sort -nr
> 412021 ./pgsql_tmp/pgsql_tmp650830.3.fileset
> 412021 ./pgsql_tmp
> ^^^ a few seconds after this last try the query aborted:
> ERROR:  temporary file size exceeds temp_file_limit (104857600kB)
> 
> > One possibility is that there are files in the tmpdir, which have been
> > unlinked, but are still opened, so their space hasn't been reclaimed.
> > You could check for that by running lsof -nn |grep pgsql_tmp Any deleted
> > files would say things like 'DEL|deleted|inode|no such'
>
> I do not really understand what you would like me to do, and when. The disk
> space is growing, and it is actual files under pgsql_tmp?

Run this during the query as either postgres or root:
| lsof -nn |grep pgsql_tmp |grep -E 'DEL|deleted|inode|no such'

Any files it lists would be interesting to know about.

> Hope this tells you something, please let me know if you would like more
> info, and again - thanks!

I think Thomas' idea is more likely.  We'd want to know the names of
files being written, either as logged by log_temp_files or from 
| find pgsql_tmp -ls
during the query.

-- 
Justin




Re: Fwd: temp_file_limit?

2022-12-18 Thread Justin Pryzby
On Sun, Dec 18, 2022 at 06:29:41PM +0100, Frits Jalvingh wrote:
> Just to be clear: my real question is: why is temp_file_limit not
> working at the specified size? Because this is my real problem: when a
> query is dying like this it will also kill other queries because these
> are also running out of space. Even when the limit is per-process it
> should not have exceeded 200GB imo.

What OS and filesystem are in use ?

Could you list the tmpdir when it's getting huge?  The filenames include
the PID, which would indicate if there's another process involved, or a
bug allowed it to get huge.
sudo du --max=2 -mx ./pgsql_tmp |sort -nr

BTW, pg_ls_tmpdir() hides directories, so you shouldn't rely on it for
listing temporary directories...

One possibility is that there are files in the tmpdir, which have been
unlinked, but are still opened, so their space hasn't been reclaimed.
You could check for that by running "lsof -nn |grep pgsql_tmp".  Any deleted
files would say things like 'DEL|deleted|inode|no such'.

> BTW, if that limit is really per process instead of per
> session/query then that is a Very Bad Thing(tm), because this makes the
> limit effectively worthless - if a query can spawn 8 parallel processes
> then you can suddenly, without any form of control, again fill up that disk.

8 is the default value of max_worker_processes and max_parallel_workers,
but 2 is the default value of max_parallel_workers_per_gather.  You're
free to change the default value to balance it with the temp_file_limit
(as suggested by the earlier-mentioned commit).
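As a rough worked example under those defaults: temp_file_limit is
per-process, so a leader plus 2 workers can together write up to
3 x 100GB = 300GB of temp files, which could explain the 300+ GB you saw.
A sketch of rebalancing (the values are only an illustration):

  SET max_parallel_workers_per_gather = 1;
  SET temp_file_limit = '50GB';  -- so that (workers + leader) * limit fits the disk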

-- 
Justin




Re: temp_file_limit?

2022-12-18 Thread Justin Pryzby
On Sun, Dec 18, 2022 at 12:48:03PM +0100, Frits Jalvingh wrote:
> Hi list,
> 
> I have a misbehaving query which uses all available disk space and then
> terminates with a "cannot write block" error. To prevent other processes
> from running into trouble I've set the following:
> 
> temp_file_limit = 100GB

> The comment in the file states that this is a per-session parameter, so
> what is going wrong here?

Do you mean the comment in postgresql.conf ?

commit d1f822e58 changed to say that temp_file_limit is actually
per-process and not per-session.

Could you send the query plan, preferably "explain analyze" (if the
query finishes sometimes) ?

log_temp_files may be helpful here.

> The query does parallelize and uses one parallel worker while executing,
> but it does not abort when the temp file limit is reached:
> 
> 345G pgsql_tmp
> 
> It does abort way later, after using around 300+ GB:
> [53400] ERROR: temporary file size exceeds temp_file_limit (104857600kB)
> Where: parallel worker

Are you sure the 345G are from only one instance of the query ?
Or is it running multiple times, or along with other queries writing
100GB of tempfiles.

It seems possible that it sometimes runs with more than one parallel
worker.  Also, are there old/stray tempfiles there which need to be
cleaned up?

-- 
Justin




Re: Postgres12 looking for possible HashAggregate issue workarounds?

2022-12-16 Thread Justin Pryzby
On Fri, Dec 16, 2022 at 03:24:17PM +, João Paulo Luís wrote:
> Hi! Sorry to post to this mailing list, but I could not find many tips 
> working around HashAggregate issues.
> 
> In a research project involving text repetition analysis (on top of public 
> documents)
> I have a VirtualMachine (CPU AMD Epyc 7502P, 128GB RAM, 12TB HDD, 2TB SSD),
> running postgres 12.12 (Ubuntu 12.12-0ubuntu0.20.04.1)
> and some tables with many rows:

> 1 - the query is making a postgresql process have 76.7 GB resident RAM usage.
> Having a WORK_MEM setting of 2GB (and "simple" COUNT() results),
> that was not expected.
> (I risk oom-killer killing my postgres as soon as I run another concurrent
> query.)

> The rows=261275 on HashAggregate  (cost=26397219.92..26399832.67 rows=261275 
> width=8) seems VERY WRONG!
> I was expecting something like rows=1.0E+09 instead.

> I would guess that HashAggregate is behaving very badly (using too much RAM 
> beyond WORK_MEM, and also badly estimating the #rows and taking forever...)

Huge memory use sounds like what was fixed in postgres 13.

https://www.postgresql.org/docs/13/release-13.html

Allow hash aggregation to use disk storage for large aggregation result
sets (Jeff Davis)

Previously, hash aggregation was avoided if it was expected to use more
than work_mem memory. Now, a hash aggregation plan can be chosen despite
that. The hash table will be spilled to disk if it exceeds work_mem
times hash_mem_multiplier.

This behavior is normally preferable to the old behavior, in which once
hash aggregation had been chosen, the hash table would be kept in memory
no matter how large it got — which could be very large if the planner
had misestimated. If necessary, behavior similar to that can be obtained
by increasing hash_mem_multiplier.
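After an upgrade to v13 or later, the equivalent knobs look like this
(a sketch; the values are placeholders):

  SET work_mem = '2GB';
  SET hash_mem_multiplier = 2.0;  -- hash aggregation may use up to work_mem * hash_mem_multiplier before spilling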

-- 
Justin




Re: Odd Choice of seq scan

2022-12-01 Thread Justin Pryzby
On Fri, Dec 02, 2022 at 11:52:19AM +1100, Paul McGarry wrote:
> Hi there,
> 
> I'm wondering if anyone has any insight into what might make the database
> choose a sequential scan for a query (table defs and plan below) like :

> Plan - seq scan of table:
> =
> > explain select orders.orderid FROM orders WHERE (orders.orderid IN 
> > ('546111') OR orders.orderid IN  (select orderid FROM orderstotrans WHERE 
> > (transid IN ('546111';

> Plan - Seq scan and filter of index:
> =
> > explain select orders.orderid FROM orders WHERE (orders.orderid IN 
> > ('546111') OR orders.orderid IN  (select orderid FROM orderstotrans WHERE 
> > (transid IN ('546111';

Could you show explain analyze ?

Show the size of the table and its indexes 
And GUC settings
And the "statistics" here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram
Maybe on both a well-behaving instance and a badly-behaving instance.
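For the sizes, something like this works (a sketch):

  SELECT relname,
         pg_size_pretty(pg_table_size(oid))   AS table_size,
         pg_size_pretty(pg_indexes_size(oid)) AS index_size
    FROM pg_class WHERE relname IN ('orders', 'orderstotrans');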

-- 
Justin




Re: Catching up with performance & PostgreSQL 15

2022-11-28 Thread Justin Pryzby
On Mon, Nov 28, 2022 at 06:59:41PM -0800, Josh Berkus wrote:
> Hey, folks:
> 
> I haven't configured a PostgreSQL server since version 11 (before that, I
> did quite a few).
> 
> What's changed in terms of performance configuration since then?  Have the
> fundamentals of shared_buffers/work_mem/max_connections changed at all?
> Which new settings are must-tunes?
> 
> I've heard about new parallel stuff and JIT, but neither is that applicable
> to my use-case.

shared buffers is the same, but btree indexes are frequently (IME) 3x
smaller (!) since deduplication was added in v13, so s_b might not need
to be as large.

In addition to setting work_mem, you can also (since v13) set
hash_mem_multiplier.

default_toast_compression = lz4 # v14
recovery_init_sync_method = syncfs # v14
check_client_connection_interval = ... # v14
wal_compression = {lz4,zstd} # v15

Peeking at my notes, there's also: partitioning, parallel query, brin
indexes, extended statistics, reindex concurrently, ...

... but I don't think anything is radically changed :)

-- 
Justin




Re: Query is sometimes fast and sometimes slow: what could be the reason?

2022-09-14 Thread Justin Pryzby
On Wed, Sep 14, 2022 at 05:02:07PM +0200, tias...@gmx.de wrote:
> What could be the reason of a query, which is sometimes fast 
> and sometimes slow (factor 10x)?
> (running on a large table).
> 
> 

Lots of possible issues.  Is it using a different query plan ?
Collect a good plan and a bad one and compare, or send both.
Perhaps use autoexplain to do so.

Turn on logging and send as much information as you can as described
here.
https://wiki.postgresql.org/wiki/Slow_Query_Questions

Please try to configure your mail client to send text mail (instead of
or in addition to the html one).

-- 
Justin




Re: Postgresql JDBC process consumes more memory than psql client

2022-09-06 Thread Justin Pryzby
On Tue, Sep 06, 2022 at 04:15:03AM +, James Pang (chaolpan) wrote:
> We make 2 comparisons between partitioned(256 HASH) and no-partitioned(same 
> data volume,same table attributes) , do same  "UPDATE,DELETE " .
>  1. with partitioned tables , the "RES" from top command memory increased 
> quickly to 160MB and keep stable there. 
>   From auto_explain trace, we did saw  partition pruning to specific 
> partition when execution the prepared sql statement by Postgresql JDBC .
> 2. with no-partitioned tables, the "RES" from top command memory only keep 
> 24MB stable there. 
>Same auto_explain , and only table and index scan there by prepared 
> sql statement by Postgresql JDBC. 
> 3. with psql client , run the UPDATE/DELETE sql locally,  partition pruning 
> works and the "RES" memory" is much less, it's about 9MB . 
> 
> Yesterday, when workload test, a lot of Postgresql JDBC connections use 
> 150-160MB memory , so we got ERROR: out of memory

How many JDBC clients were there?

Did you use the same number of clients when you used psql ?
Otherwise it wasn't a fair test.

Also, did you try using psql with PREPARE+EXECUTE ?  I imagine memory
use would match JDBC.
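A sketch of the psql side of that comparison (table/column names are
hypothetical; the point is that a long-lived prepared statement can switch to
a generic plan after several executions, which is where the partitioned case
tends to get expensive):

  PREPARE upd (bigint) AS UPDATE t_hash SET val = val + 1 WHERE part_key = $1;
  EXECUTE upd(42);
  EXECUTE upd(42);  -- repeat; after 5 executions a generic plan may be chosen
  SET plan_cache_mode = force_generic_plan;  -- v12+: forces the generic plan, like a long-lived JDBC statement
  EXECUTE upd(42);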

It's probably not important, but if you set the log level high enough,
you could log memory use more accurately using log_executor_stats
(maxrss).

> So, looks like something with Postgresql JDBC driver lead to the high memory 
> consumption when table is partitioned , even when table is no partitioned , 
> compared with psql client, it consumes more memory.   Any suggestions to tune 
> that ?  PG V13 , OS RHEL8 , Virtua machine on VMWARE. We make 
> shared_buffers=36% physical memory ,  effective_cache_size=70%physical memory 
> , total physical memory is about 128GB.

I sent this before hoping to get answers to all the most common
questions earlier, rather than being spread out over the first handful
of emails.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

version 13 point what ?
what are the other non-default gucs ?
what are the query plans ?

-- 
Justin




Re: Postgresql JDBC process consumes more memory than psql client

2022-09-05 Thread Justin Pryzby
On Mon, Sep 05, 2022 at 12:52:14PM +, James Pang (chaolpan) wrote:
> Any idea how to print SQL plan from JDBC driver ? 

You could use "explain execute" on the client, or autoexplain on the
server-side.
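For example, a sketch (the statement and parameter are placeholders):

  PREPARE s (int) AS DELETE FROM mytable WHERE id = $1;
  EXPLAIN EXECUTE s(42);

For the statements coming from JDBC, auto_explain has to be loaded on the
server side (shared_preload_libraries or session_preload_libraries), with
auto_explain.log_min_duration set to a suitable threshold.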

-- 
Justin




Re: Postgresql JDBC process consumes more memory than psql client

2022-09-05 Thread Justin Pryzby
On Mon, Sep 05, 2022 at 12:40:46PM +, James Pang (chaolpan) wrote:
>   We run same update or delete SQL statement " DELETE FROM ... WHERE ... "  
> the table is a hash partition table (256 hash partitions). When run the sql 
> from Postgresql JDBC driver, it soon increased to 150MB memory (RES filed 
> from top command),   but when run the same SQL from psql , it only 
> consumes about 10MB memory.  UPDATE statements is similar , need 100MB 
> memory, even it delete or update 0 rows.  Any specific control about 
> Postgresql JDBC driver ?

It sounds like JDBC is using prepared statements, and partitions maybe
weren't pruned by the server.  What is the query plan from psql vs from
jdbc ?

https://wiki.postgresql.org/wiki/Slow_Query_Questions

What version is the postgres server ?
That affects pruning as well as memory use.

https://www.postgresql.org/docs/14/release-14.html
Improve the performance of updates and deletes on partitioned tables
with many partitions (Amit Langote, Tom Lane)

This change greatly reduces the planner's overhead for such cases, and
also allows updates/deletes on partitioned tables to use execution-time
partition pruning.

Actually, this is about the same response as when you asked in June,
except that was about UPDATE.
https://www.postgresql.org/message-id/ph0pr11mb519134d4171a126776e3e063d6...@ph0pr11mb5191.namprd11.prod.outlook.com

-- 
Justin




Re: to_jsonb performance on array aggregated correlated subqueries

2022-08-12 Thread Justin Pryzby
On Fri, Aug 12, 2022 at 07:02:36PM +, Nico Heller wrote:
> I knew I forgot something: We are currently on 13.6. When was this issue
> fixed?

There's a WIP/proposed fix, but the fix is not released.
I asked about your version because jit was disabled by default in v11.
But it's enabled by default in v12.

https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items#Older_bugs_affecting_stable_branches

-- 
Justin




Re: to_jsonb performance on array aggregated correlated subqueries

2022-08-12 Thread Justin Pryzby
What version of postgres ?

I wonder if you're hitting the known memory leak involving jit.
Try with jit=off or jit_inline_above_cost=-1.

-- 
Justin




Re: Postgresql 14 partitioning advice

2022-08-08 Thread Justin Pryzby
On Mon, Aug 08, 2022 at 03:45:11PM -0700, Slava Mudry wrote:
> Postgres 14 improved partitioning quite a bit. I used it in Postgres 9 and
> there was a lot of locking on partition hierarchy when you add/drop
> partition tables.

Note that postgres 9 didn't have native/declarative partitioning, and most
improvements in native partitioning don't apply to legacy/inheritance
partitioning.

https://www.postgresql.org/docs/devel/ddl-partitioning.html

"Native" partitioning added in v10 tends to require stronger locks for add/drop
than legacy partitioning, since partitions have associated bounds, which cannot
overlap.  The locking is improved in v12 with CREATE+ATTACH and v14 with
DETACH CONCURRENTLY+DROP.
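A sketch of both techniques on a hypothetical time-partitioned table:

  -- v12+: create the table separately and attach it, which takes a weaker lock
  -- on the parent than CREATE TABLE ... PARTITION OF
  CREATE TABLE t_2022_09 (LIKE t INCLUDING ALL);
  ALTER TABLE t ATTACH PARTITION t_2022_09 FOR VALUES FROM ('2022-09-01') TO ('2022-10-01');

  -- v14+: detach an old partition without blocking concurrent queries, then drop it
  ALTER TABLE t DETACH PARTITION t_2022_01 CONCURRENTLY;
  DROP TABLE t_2022_01;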

-- 
Justin




Re: pg_wal filling up while running huge updates

2022-08-05 Thread Justin Pryzby
On Fri, Aug 05, 2022 at 06:00:02PM +0530, aditya desai wrote:
> Hi,
> We are doing an oracle to postgres migration(5 TB+ data). We are encoding
> and decoding BLOB data after migration and for that we are running updates
> on tables having BLOB/CLOB data. When we execute this pg_wal is filling up.

Could you please include basic information in each new thread you create ?

https://wiki.postgresql.org/wiki/Server_Configuration
https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

-- 
Justin




Re: Postgresql 14 partitioning advice

2022-07-27 Thread Justin Pryzby
On Wed, Jul 27, 2022 at 08:55:14AM -0400, Rick Otten wrote:
> I'm spinning up a new Postgresql 14 database where I'll have to store a
> couple years worth of time series data at the rate of single-digit millions
> of rows per day.  Since this has to run in AWS Aurora, I can't use
> TimescaleDB.

> One person I talked to said "try not to have more than 100 partitions",
> even with the latest postgresql you'll end up with a lot of lock contention
> if you go over 100 partitions.

I'm not familiar with this (but now I'm curious).  We have over 2000 partitions
in some tables.  No locking issue that I'm aware of.  One issue that I *have*
seen is if you have many partitions, you can end up with query plans with a
very large number of planner nodes, and it's hard to set
work_mem*hash_mem_multiplier to account for that.

> This person also recommended manually
> kicking off vacuums on a regular schedule rather than trusting autovacuum
> to work reliably on the partitioned tables.

They must mean *analyze*, which does not run automatically on the partitioned
tables (only the partitions).  The partitioned table is empty, so doesn't need
to be vacuumed.

> I've got several keys, besides the obvious time-key that I could partition
> on.   I could do a multi-key partitioning scheme.  Since the data is
> inbound at a relatively steady rate, if I partition on time, I can adjust
> the partitions to be reasonably similarly sized.  What is a good partition
> size?

Depends on 1) the target number of partitions; and 2) the target size for
indexes on those partitions.  More partition keys will lead to smaller indexes.
Depending on the type of index, and the index keys, to get good INSERT
performance, you may need to set shared_buffers to accommodate the sum of size
of all the indexes (but maybe not, if the leading column is timestamp).
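To see what that sum currently looks like for a partitioned table, something
like this works (a sketch; 'metrics' is a placeholder, and it only counts
direct children):

  SELECT pg_size_pretty(sum(pg_indexes_size(c.oid))) AS total_index_size
    FROM pg_inherits i JOIN pg_class c ON c.oid = i.inhrelid
    WHERE i.inhparent = 'metrics'::regclass;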

> Since the data most frequently queried would be recent data (say the past
> month or so) would it make sense to build an archiving strategy that rolled
> up older partitions into larger ones?  ie, do daily partitions for the
> first four weeks, then come up with a process that rolled them up into
> monthly partitions for the next few months, then maybe quarterly partitions
> for the data older than a year?  (I'm thinking about ways to keep the
> partition count low - if that advice is justified.)

I think it can make sense.  I do that myself in order to: 1) avoid having a
huge *total* number of tables (which causes pg_attribute to be large, since our
tables are also "wide"); and 2) make our backups of "recent data" smaller; and
3) make autoanalyze a bit more efficient (a monthly partition will be analyzed
numerous times the 2nd half of the month, even though all the historic data
hasn't changed at all).

> Or, should I just have a single 7 Trillion row table with a BRIN index on
> the timestamp and not mess with partitions at all?

Are you going to need to DELETE data ?  Then this isn't great, and DELETEing
data will inevitably cause a lower correlation, making BRIN less effective.

-- 
Justin




Re: PostgresSQL 9.5.21 very slow to connect and perform basic queries

2022-07-21 Thread Justin Pryzby
On Thu, Jul 21, 2022 at 03:59:30PM -0400, bruno da silva wrote:
>   OS/version: CentOS release 6.9 (Final)

How are these set ?

tail /sys/kernel/mm/ksm/run 
/sys/kernel/mm/transparent_hugepage/{defrag,enabled,khugepaged/defrag} 
/proc/sys/vm/zone_reclaim_mode

I suspect you may be suffering from issues with transparent huge pages.

I suggest to disable KSM and THP, or upgrade to a newer OS.

I've written before about these:
https://www.postgresql.org/message-id/20170524155855.gh31...@telsasoft.com
https://www.postgresql.org/message-id/20190625162338.gf18...@telsasoft.com
https://www.postgresql.org/message-id/20170718180152.ge17...@telsasoft.com
https://www.postgresql.org/message-id/20191004060300.ga11...@telsasoft.com
https://www.postgresql.org/message-id/20200413144254.gs2...@telsasoft.com
https://www.postgresql.org/message-id/20220329182453.ga28...@telsasoft.com

On Thu, Jul 21, 2022 at 04:01:10PM -0400, bruno da silva wrote:
> The issue started a month ago.

Ok .. but how long has the DB been running under this environment ?

-- 
Justin




Re: PostgresSQL 9.5.21 very slow to connect and perform basic queries

2022-07-21 Thread Justin Pryzby
On Thu, Jul 21, 2022 at 02:37:35PM -0400, bruno da silva wrote:
> I'm investigating an issue on a PostgresSql 9.5.21 installation that
> becomes unusable in an intermittent way. Simple queries like "select
> now();" could take 20s. commits take 2s. and all gets fixed after an engine
> restart.
> 
> I look into the pg logs and no signs of errors. and checkpoints are
> always timed. The machine is well provisioned, load isn't too high, and cpu
> io wait is under 1%.
> 
> any suggestions on what I should check more?

What OS/version is it ?

What GUCs have you changed ?

Is it a new issue ?

https://wiki.postgresql.org/wiki/Slow_Query_Questions

Operating system+version
What OS / version ? At least for linux, you can get the distribution by 
running: tail /etc/*release 

GUC Settings
What database configuration settings have you changed? What are their values? 
(These are things like "shared_buffers", "work_mem", "enable_seq_scan", 
"effective_io_concurrency", "effective_cache_size", etc). See Server 
Configuration for a useful query that will show all of your non-default 
database settings, in an easier to read format than posting pieces of your 
postgresql.conf file. 

-- 
Justin




Re: Occasional performance issue after changing table partitions

2022-07-12 Thread Justin Pryzby
On Wed, Jul 13, 2022 at 03:13:46AM +1200, Nathan Ward wrote:
> I have been stepping through the various statements which are different 
> between the two functions, and note that when I do math on a timestamp in a 
> SELECT statement (i.e. _event_timestamp - INTERVAL ‘1 hour’),
> the planner takes 50ms or so - note that the result of the timestamp is used 
> to search the partition key.
> If I declare a function which does the math in advance, stores it in a 
> variable and then runs the SELECT, the planner takes less than 1ms.
> Does this mean it’s calculating the timestamp for each partition, or 
> something like that?

I'm not sure I understand what you're doing - the relevant parts of your
function text and query plan would help here.

Maybe auto_explain.log_nested_statements would be useful ?

Note that "partition pruning" can happen even if you don't have a literal
constant.  For example:
|explain(costs off) SELECT * FROM metrics WHERE start_time > now()::timestamp - '1 days'::interval;
| Append
|   Subplans Removed: 36

> I see Postgres 14 release notes has information about performance 
> improvements in the planner for updates on tables with "many partitions”. Is 
> 444 partitions “many”?
> My updates are all impacting a single partition only.

It sounds like that'll certainly help you.  Another option is to update the
partition directly (which is what we do, to be able to use "ON CONFLICT").
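A sketch of what "update the partition directly" looks like (names are
hypothetical, and it assumes a unique index on (id, ts) in that partition):

  INSERT INTO metrics_2022_07 (id, ts, val)
    VALUES (1, '2022-07-12 00:00', 42)
    ON CONFLICT (id, ts) DO UPDATE SET val = EXCLUDED.val;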

I think with "old partitioning with inheritance", more than a few hundred
partitions was considered unreasonable, and plan-time suffered.

With relkind=p native/declarative partitioning, a few hundred is considered
reasonable, and a few thousand is still considered excessive - even if the
planner time is no issue, you'll still run into problems like "work-mem is
per-node", which works poorly when you might have 10x more nodes.

TBH, this doesn't sound related to your original issue.

-- 
Justin




Re: Oracle_FDW table performance issue

2022-07-11 Thread Justin Pryzby
On Mon, Jul 11, 2022 at 05:38:34PM +0530, aditya desai wrote:
> Hi,
> I have one Oracle fdw table which gives a performance issue when joined with
> a local temp table.
> 
> select * from oracle_fdw_table where transaction_id in ( select
> transaction_id from temp_table)  54 seconds. Seeing HASH SEMI JOIN  in
> EXPLAIN PLAN. temp_table has only 74 records.

You'd have to share the plan.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

Do the tables have updated stats ?




Re: Occasional performance issue after changing table partitions

2022-07-10 Thread Justin Pryzby
On Mon, Jul 11, 2022 at 03:21:38PM +1200, Nathan Ward wrote:
> > Note that postgres doesn't automatically analyze parent tables, so you 
> > should
> > maybe do that whenever the data changes enough for it to matter.
> 
> Hmm. This raises some stuff I’m not familiar with - does analysing a parent 
> table do anything?

Yes

You could check if you have stats now (maybe due to a global ANALYZE or
analyzedb) and how the query plans change if you analyze.
Wrapping it in a transaction, as below, may be overly conservative.

SELECT COUNT(1) FROM pg_stats WHERE tablename=PARENT;
SELECT last_analyze, last_autoanalyze, relname FROM pg_stat_all_tables WHERE 
relname=PARENT;
begin;
SET default_statistics_target=10;
ANALYZE;
explain SELECT [...];
rollback;

> I got the impression that analysing the parent was just shorthand for 
> analysing all of the attached partitions.

Could you let us know if the documentation left that impression ?

See here (this was updated recently).

https://www.postgresql.org/docs/13/sql-analyze.html#id-1.9.3.46.8

For partitioned tables, ANALYZE gathers statistics by sampling rows from all 
partitions; in addition, it will recurse into each partition and update its 
statistics. Each leaf partition is analyzed only once, even with multi-level 
partitioning. No statistics are collected for only the parent table (without 
data from its partitions), because with partitioning it's guaranteed to be 
empty.

By contrast, if the table being analyzed has inheritance children, ANALYZE 
gathers two sets of statistics: one on the rows of the parent table only, and a 
second including rows of both the parent table and all of its children. This 
second set of statistics is needed when planning queries that process the 
inheritance tree as a whole. The child tables themselves are not individually 
analyzed in this case.

The autovacuum daemon does not process partitioned tables, nor does it process 
inheritance parents if only the children are ever modified. It is usually 
necessary to periodically run a manual ANALYZE to keep the statistics of the 
table hierarchy up to date.

> Perhaps because I attach a table with data, the parent sometimes decides it 
> needs to run analyse on a bunch of things?

No, that doesn't happen.

> Or, maybe it uses the most recently attached partition, with bad statistics, 
> to plan queries that only touch other partitions?

This is closer to what I was talking about.

To be clear, you are using relkind=p partitions (added in v10), and not
inheritance parents, right ?

-- 
Justin




Re: Occasional performance issue after changing table partitions

2022-07-10 Thread Justin Pryzby
On Sun, Jul 10, 2022 at 04:55:34PM +1200, Nathan Ward wrote:
> I am running Postgres 13 on CentOS 7, installed from the yum.postgresql.org 
>  repo.

It doesn't sound relevant, but what kind of storage systems is postgres using ?
Filesystem, raid, device.

Is the high CPU use related to to autovacuum/autoanalyze ?

> The issue I am having, is that when the daily data usage aggregation runs, 
> sometimes we have a big performance impact, with the following 
> characteristics which happen *after* the aggregation job runs in it usual 
> fast time of 12s or so:
> - The aggregation runs fast as per normal
> - Load on the server goes to 30-40 - recall we have quite high “max 
> connections” to keep throughput high when the client is far (16ms) from the 
> server

I suggest to install and enable autoexplain to see what's running slowly here,
and what its query plans are.  It seems possible that when the daily
aggregation script drops the old partitions, the plan changes for the worse.
I'm not sure what the fix is - maybe you just need to run vacuum or analyze on
the new partitions soon after populating them.

For good measure, also set log_autovacuum_min_duration=0 (or something other
than -1) (and while you're at it, log_checkpoints=on, and log_lock_waits=on if
you haven't already).

Note that postgres doesn't automatically analyze parent tables, so you should
maybe do that whenever the data changes enough for it to matter.

-- 
Justin




Re: partition pruning only works for select but update

2022-07-01 Thread Justin Pryzby
On Fri, Jul 01, 2022 at 08:30:40AM +, James Pang (chaolpan) wrote:
> We have other application depend on V13, possible to backport code changes to 
> V13 as 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=86dc90056dfdbd9d1b891718d2e5614e3e432f35

Do you mean that the other application needs to be updated to work with v14?
Or that you haven't checked yet if they work with v14?

In any case, I'm sure the feature won't be backpatched to v13 - it's an
improvement but not a bugfix.

-- 
Justin

> -Original Message-
> From: Tom Lane  
> Sent: Tuesday, June 28, 2022 9:30 PM
> To: James Pang (chaolpan) 
> Cc: pgsql-performance@lists.postgresql.org
> Subject: Re: partition pruning only works for select but update
> 
> "James Pang (chaolpan)"  writes:
> > But when
> > Explain update table set .. where  partitionkey between  to_timestamp() and 
> > to_timestamp();
> >   It still show all of partitions with update ...
> 
> In releases before v14, partition pruning is far stupider for UPDATE (and 
> DELETE) than it is for SELECT.




Re: Fluctuating performance of updates on small table with trigger

2022-06-29 Thread Justin Pryzby
On Wed, Jun 29, 2022 at 09:31:58PM +0200, Mikkel Lauritsen wrote:
> In short I'm running PostgreSQL 14.1 on Linux on a small test machine with

should try to upgrade to 14.4, for $reasons

> Is there any feasible way to find out what it is that causes Postgres to
> start doing slow updates? My guess would be a buffer filling up or something
> similar, but the regularity between runs paired with the irregular lengths
> of the fast and slow phases in each run doesn't really seem to fit with
> this.

Set log_checkpoints=on, log_autovacuum_min_duration=0, log_lock_waits=on, and
enable autoexplain with auto_explain.log_nested_statements=on.

Then see what's in the logs when that happens.
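A sketch of those settings (the auto_explain threshold is an assumption,
pick your own):

  ALTER SYSTEM SET log_checkpoints = on;
  ALTER SYSTEM SET log_autovacuum_min_duration = 0;
  ALTER SYSTEM SET log_lock_waits = on;
  SELECT pg_reload_conf();

auto_explain itself has to be loaded via shared_preload_libraries or
session_preload_libraries, with auto_explain.log_min_duration and
auto_explain.log_nested_statements = on set in postgresql.conf.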

@hackers: the first two of those are enabled by default in 15dev, and this
inquiry seems to support that change.

-- 
Justin




Re: Strange behavior of limit clause in complex query

2022-06-08 Thread Justin Pryzby
On Wed, Jun 08, 2022 at 09:44:08AM +0100, Paulo Silva wrote:
> But if I add an ORDER BY and a LIMIT something goes very wrong (Q2):

A somewhat common problem.

A common workaround is to change "ORDER BY a" to something like "ORDER BY a+0"
(if your framework will allow it).

> An EXPLAIN (ANALYZE, BUFFERS) for Q2 returns this:
...
>->  Index Scan Backward using ix_ng_content_date on ng_content 
> "Extent1"  (cost=0.43..40616715.85 rows=2231839 width=12) (actual 
> time=11027.808..183839.289 rows=5 loops=1)
>  Filter: ((2 = id_status) AND (date_from <= 
> LOCALTIMESTAMP) AND (date_to >= LOCALTIMESTAMP) AND (SubPlan 1))
>  Rows Removed by Filter: 4685618
>  Buffers: shared hit=15414533 read=564480 written=504

I'm not sure if it would help your original issue, but the rowcount estimate
here is bad - overestimating 2231839 rows instead of 5.

Could you try to determine which of those conditions (id_status, date_from,
date_to, or SubPlan) causes the mis-estimate, or if the estimate is only wrong
when they're combined ?

-- 
Justin




Re: Query is taking too long i intermittent

2022-06-06 Thread Justin Pryzby
On Mon, Jun 06, 2022 at 03:28:43PM +0530, Mayank Kandari wrote:
> 

Thanks for including the link*.

(*FYI, I find it to be kind of unfriendly to ask the same question in multiple
forums, simultaneously - it's like cross-posting.  The goal seems to be to
demand an answer from the internet community as quickly as possible.)

> Indexes:
> "event_pkey" PRIMARY KEY, btree (event_id, pkey)
> "event_event_sec_event_usec_idx" btree (event_sec, event_usec)
> When I execute the following query it takes 1 - 2 milliseconds to execute.

> I am using libpq to connect the Postgres server in c++ code. Postgres
> server version is 12.10
> Time is provided as a parameter to function executing this query, it
> contains epoche seconds and microseconds.

Are you using the simple query protocol or the extended protocol ?

> This query is executed every 30 seconds on the same client connection
> (Which is persistent for weeks). This process runs for weeks, but some time
> same query starts taking more than 10 minutes. Once it takes 10 minutes,
> after that every execution takes > 10 minutes.

> If I restart the process it recreated connection with the server and now
> execution time again falls back to 1-2 milliseconds. This issue is
> intermittent, sometimes it triggers after a week of the running process and
> sometime after 2 - 3 weeks of the running process.

Could you get the query plan for the good vs bad executions ?

To get the "bad" plan, I suggest to enable auto-explain and set its min
duration to 10 seconds or 1 minute.  The "good" plan you can get any time from
psql.

> SELECT event_id FROM event WHERE (event_sec > time.seconds) OR
> ((event_sec=time.seconds) AND (event_usec>=time.useconds) ORDER BY
> event_sec, event_usec LIMIT 1

I think it'd be better if the column was a float storing the fractional number
of seconds.  Currently, it may be hard for the planner to estimate rowcounts if
the conditions are not independent.  I don't know if it's related to this
problem, though.




Re: postgres backend process hang on " D " state

2022-05-29 Thread Justin Pryzby
On Mon, May 30, 2022 at 01:19:56AM +, James Pang (chaolpan) wrote:
> 1. extensions 
>   shared_preload_libraries = 
> 'orafce,pgaudit,pg_cron,pg_stat_statements,set_user'
> 2.  psql can not login now ,it hang there too, so can not check anything from 
> pg_stats_* views
> 3.  one main app user and 2 schemas ,no long running transactions . 
> 4. we use /pgdata , it's on xfs , lvm/vg RHEL8.4 ,it's a shared storage, no 
> use root filesystem.
> /dev/mapper/pgdatavg-pgdatalv 500G  230G  271G  46% /pgdata
> /dev/mapper/pgdatavg-pgarchivelv  190G  1.5G  189G   1% /pgarchive
> /dev/mapper/pgdatavg-pgwallv  100G   34G   67G  34% /pgwal

What are the LVM PVs ?  Is it a scsi/virt device ?  Or iscsi/drbd/???

I didn't hear back if there's any kernel errors.
Is the storage broken/stuck/disconnected ?
Can you run "time find /pgdata /pgarchive /pgwal -ls |wc" ?

Could you run "ps -u postgres -O wchan="

Can you strace one of the stuck backends ?

It sounds like you'll have to restart the service or VM (forcibly if necessary)
to resolve the immediate issue and then collect the other info, and leave a
"psql" open to try to (if the problem recurs) check pg_stat_activity and other
DB info.




Re: postgres backend process hang on " D " state

2022-05-29 Thread Justin Pryzby
On Sun, May 29, 2022 at 01:20:12PM +, James Pang (chaolpan) wrote:
>We have a performance test on Postgresql 13.4 on RHEL8.4 , just after 
> connection storm in ( 952 new connections coming in 1 minute),  a lot of 
> backends start on " D " state,  and when more sessions got disconnected, they 
> do not exit successfully, instead became  "defunct".   No errors from 
> postgresql.log , just after the connection storm, some pg_cron workers can 
> not started either.  The server is a Virtual machine and no IO hang 
> (actually, IO load is very low).   Could be a postgresql bug or an OS abnormal 
> behavior?

What settings have you set ?
https://wiki.postgresql.org/wiki/Server_Configuration

What extensions do you have loaded?  \dx

Send the output of SELECT * FROM pg_stat_activity either as an attachment or in
\x mode?

What is your data dir ?  Is it on the VM's root filesystem or something else ?
Show the output of "mount".  Are there any kernel messages in /var/log/messages
or `dmesg` ?

How many relations are in your schema ?
Are you using temp tables ?
Long-running transactions ?

-- 
Justin




Re: How to monitor Postgres real memory usage

2022-05-27 Thread Justin Pryzby
On Sat, May 28, 2022 at 01:40:14AM +0800, 徐志宇徐 wrote:
> vm.swappiness=0

I think this is related to the problem.

swappiness=0 means to *never* use swap, even if that means that processes are
killed.

If you really wanted that, you should remove the swap space.

Swap is extremely slow and worth avoiding, but this doesn't let you use it at
all.  You can't even look at your swap usage as a diagnostic measure to tell if
things had been paged out at some point.

I *suspect* the problem will go away if you set swappiness=1 in /proc (and in
sysctl.conf).
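
Something like this, as root (the first command takes effect immediately; the
sysctl.conf line makes it persistent across reboots):

sysctl -w vm.swappiness=1
echo 'vm.swappiness = 1' >> /etc/sysctl.conf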

-- 
Justin




Re: How to monitor Postgres real memory usage

2022-05-26 Thread Justin Pryzby
On Fri, May 27, 2022 at 01:39:15AM +0800, 徐志宇徐 wrote:
> Hi Justin
> 
>   Thanks for you explaination.
> 
>   > > What postgres version ?
>   > > How was it installed ?  From souce?  From a package ?
>   I am using Postgres 11.1 .It's installed by package.

This is quite old, and missing ~4 years of bugfixes.

What's the output of these commands?
tail /proc/sys/vm/overcommit_*
tail /proc/sys/vm/nr_*hugepages /proc/cmdline
cat /proc/meminfo
uname -a

-- 
Justin




Re: How to monitor Postgres real memory usage

2022-05-26 Thread Justin Pryzby
> enable_seqscan = 'off'

Why is this here ?  I think when people set this, it's because they "want to
use more index scans to make things faster".  But index scans aren't
necessarily faster, and this tries to force their use even when it will be
slower.  It's better to address the queries that are slow (or encourage index
scans by decreasing random_page_cost).
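
For example, after removing enable_seqscan='off' from postgresql.conf, something like
this (1.1 is a common starting point for SSD-backed storage, not a measured value):

ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();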

> maintenance_work_mem = '64MB'
> autovacuum_max_workers = '20'
> vacuum_cost_limit = '2000'
> autovacuum_vacuum_scale_factor = '0.0002'
> autovacuum_analyze_scale_factor = '0.1'

This means you're going to use up to 20 processes simultaneously running vacuum
(each of which may use 64MB memory).  What kind of storage does the server
have?  Can it support 20 background processes reading from disk, in addition to
other processs ?

Justin Pryzby  于2022年5月25日周三 01:40写道:
> > What postgres version ?
> > How was it installed ?  From souce?  From a package ?

What about this ?

I'm not sure how/if this would affect memory allocation, but if the server is
slow, processes will be waiting longer, rather than completing quickly, and
using their RAM for a longer period...

Does the postgres user have any rlimits set ?

Check:
ps -fu postgres
# then:
sudo cat /proc/2948/limits




Re: How to monitor Postgres real memory usage

2022-05-24 Thread Justin Pryzby
On Wed, May 25, 2022 at 12:25:28AM +0800, 徐志宇徐 wrote:
> Hi All
> 
>   I am a Database DBA. I focus on PostgreSQL and DB2.
>   Recently. I experience some memory issue. The postgres unable allocate
> memory. I don't know how to monitor Postgres memory usage.

Postgres is just an OS Process, so should be monitored like any other.

What OS are you using ?

Know that the OS may attribute "shared buffers" to different processes, or
multiple processes.

>  This server have 16G memory. On that time. The free command display only 3
> G memory used. The share_buffers almost 6G.
> 
>  On that time. The server have 100 active applications.
>  New connection failed. I have to kill some application by os command "kill 
> -9"

It's almost always a bad idea to kill postgres with kill -9.

> The checkpoint command execute very slow. almost need 5-10 seconds.

Do you mean an interactive checkpoint command ?
Or logs from log_checkpoint ?

>   Is there any useful command to summary PostgreSQL memory usage ?

You can check memory use of an individual query with "explain (analyze,buffers) 
.."
https://wiki.postgresql.org/wiki/Slow_Query_Questions

What settings have you used in postgres ?
https://wiki.postgresql.org/wiki/Server_Configuration

What postgres version ?
How was it installed ?  From souce?  From a package ?

-- 
Justin




Re: DB connection issue suggestions

2022-05-12 Thread Justin Pryzby
If the problem occurs gradually (like leaking 20 connections per hour during
ETL), you can check pg_stat_activity every hour or so to try to observe the
problem before all the connection slots are used up, to collect diagnostic
information.

Alternately, leave a connection opened to the DB and wait until all connection
slots *are* used up, and then check pg_stat_activity.  That will take longer,
and you'll have more information to weed through.
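
A query like this, run periodically (by hand or from cron), is usually enough to see
what's accumulating:

SELECT datname, usename, state, wait_event_type, wait_event,
       clock_timestamp() - xact_start AS xact_age,
       left(query, 60) AS query
FROM pg_stat_activity
ORDER BY xact_age DESC NULLS LAST;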

What messages are in the server's log ?

v11.2 is years old and hundreds of bugfixes behind.  Since you ran into this
problem anyway, why not run 11.16, which was released today ?

How did you install postgres 11 ?  From source or from packages ?  Which
packages ?  The semi-official PGDG RPM packages are available here:
https://yum.postgresql.org/

-- 
Justin




Re: DB connection issue suggestions

2022-05-11 Thread Justin Pryzby
On Wed, May 11, 2022 at 09:52:10AM +0800, Sudhir Guna wrote:
> Hi Justin,
> 
> Thank you for reviewing.
> 
> I have tried to run the below query and could see only less than 5
> connections active when I get this error. The total rows I see is only 10
> including idle and active sessions for this output.

That doesn't sound right.  Are you sure you're connecting to the correct
instance ?  Are there really only 5 postgres processes on the server, and fewer
than 5 connections to its network port or socket ?

You didn't provide any other info like what OS this is.

-- 
Justin




Re: DB connection issue suggestions

2022-05-10 Thread Justin Pryzby
On Wed, May 11, 2022 at 12:59:01AM +0800, Sudhir Guna wrote:
>  Dear All,
> 
> We have recently upgraded Postgresql 9.4 standalone server to Postgresql
> 11.2 with High Availability (2 servers : Master and Standby).
> 
> While trying to test using ETL applications and reports, we observe that
> the ETL jobs fails with below error,
> 
> 2022/05/06 16:27:36 - Error occurred while trying to connect to the database
> 2022/05/06 16:27:36 - Error connecting to database: (using class
> org.postgresql.Driver)
> 2022/05/06 16:27:36 - FATAL: Sorry, too many clients already
> 
> We have increased the max_connections = 1000 in postgresql.conf file.
> 
> It worked ok for a day and later we get the same error message.
> 
> Please help to advise on any additional settings required. The prior
> Postgresql 9.4 had the default max_connections = 100 and the applications
> worked fine.

It sounds like at least one thing is still running, perhaps running very
slowly.

You should monitor the number of connections to figure out what.

If you expect to be able to run with only 100 connections, then when
connections>200, there's already over 100 connections which shouldn't still be
there.

You could query pg_stat_activity to determine what they're doing - trying to
run a slow query ?  Are all/most of them stuck doing the same thing ?

You should try to provide the information here for the slow query, and for the
rest of your environment.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-- 
Justin




Re: Query Planner not taking advantage of HASH PARTITION

2022-04-21 Thread Justin Pryzby
On Wed, Apr 20, 2022 at 07:11:37PM -0700, Benjamin Tingle wrote:
> @ the first point about write locks
> I think I had/have a misconception about how inserts work in postgres. It's
> my understanding that postgres will never draft a parallel insert plan for
> any query (except maybe CREATE TABLE AS?)

It's correct that DML (INSERT/UPDATE/DELETE) currently is not run in parallel.
https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

> because the process needs to acquire an exclusive access write lock to the
> table it is inserting on.

But this is incorrect - DML acquires a relation lock, but not a strong one.
Multiple processes can insert into a table at once (because the row-excl lock
level is not self-conflicting, to be technical).
https://www.postgresql.org/docs/current/explicit-locking.html

In fact, that's a design requirement.  It's understood that many people would
be unhappy if only one client were able to run UPDATEs at once, and that only a
toy system would acquire a strong lock for DML.
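
A quick way to see this for yourself (table name t is made up): keep an INSERT open in
one session, insert from a second session, and look at the lock that was actually taken.

CREATE TABLE t (i int);

-- session 1
BEGIN;
INSERT INTO t VALUES (1);

-- session 2: completes immediately, even though session 1's transaction is still open
INSERT INTO t VALUES (2);

-- either session: the relation lock held by the open transaction
SELECT mode, granted FROM pg_locks WHERE relation = 't'::regclass;
-- shows RowExclusiveLock, which does not conflict with itself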

-- 
Justin




Re: significant jump in sql statement timing for on server vs a remote connection

2022-04-19 Thread Justin Pryzby
On Tue, Apr 19, 2022 at 03:00:09PM -0600, Sbob wrote:
> We are debugging a sql performance issue. We have a sql file with 50,000
> simple select statements in it. If I run the file locally it completes in
> less than 15sec.  If I force the local connection to be a tcp/ip connection
> via psql -h and I get approximately the same results, 15 - 16sec.
> 
> 
> However if we move the file to another server in the same network and run
> with a psql -h then it runs for more than 10min. Are there any postgres
> specific issues / settings / connection overhead  we should look at? Or is
> this simply a network issue and fully outside the scope of the postgres
> database?

What OS ?  What kind of authentication are you using ?
Is there a connection pooler involved ?  Did you try like that ?

Did you test how long it takes to run 10k empty statements locally vs remotely ?
time yes 'SELECT;' |head - |psql ... >/dev/null

-- 
Justin




Re: Query Tunning related to function

2022-04-16 Thread Justin Pryzby
On Thu, Apr 14, 2022 at 06:03:33AM +, Kumar, Mukesh wrote:
> We are running the below query in PostgreSQL and its taking approx. 8 to 9 
> sec to run the query.
> 
> Query - 1 ...
> 
> The explain plan and other details are placed at below link for more 
> information. We have checked the indexes on column but in the explain plan it 
> is showing as Seq Scan which we have to find out.
> 
> https://explain.depesz.com/s/Jsiw#stats

There's a list of information to provide in the postgres wiki, and people here
sent you a link to that wiki page on Feb 27, Mar 1, and Apr 12.  Your problem
report is still missing a lot of that information.  Asking for it piece by
piece would be tedious.

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-04-13 Thread Justin Pryzby
On Wed, Apr 13, 2022 at 03:36:19PM +, l...@laurent-hasson.com wrote:
> After a lot of back and forth, someone in IT informed us that the database VM 
> is under a backup schedule using Veeam. Apparently, during the backup window, 
> Veeam creates a snapshot and that takes the VM offline for a couple of 
> minutes… And of course, they scheduled this right at the busiest time of the 
> day for this machine which is during our nightly ETL. Their backup doesn’t 
> perform very week either, which explained why the failure seemed to randomly 
> happen at various points during our ETL (which takes about 2h30mn).
> 
> They moved the schedule out and the issue has not happened again over the 
> past 3 weeks. This looks like it was the root cause and would explain (I 
> think) how the database and the client simultaneously reported a connection 
> timeout.
> 
> Thank you so much for all your help in trying to figure this out and 
> exonerate Postgres.

Great, thanks for letting us know.
This time it wasn't postgres' fault; you're 2 for 3 ;)

One issue I've seen is if a vmware snapshot is taken and then saved for a long
time.  It can be okay if VEEM takes a transient snapshot, copies its data, and
then destroys the snapshot.  But it can be bad if multiple snapshots are taken
and then left around for a long time to use as a backup themselves.

-- 
Justin




Re: HIGH IO and Less CPU utilization

2022-03-31 Thread Justin Pryzby
On Wed, Mar 30, 2022 at 10:17:38AM +0530, Rambabu g wrote:
> Hi Justin,
> 
> Only one query is causing the issue, sharing the def of indexes. Please
> have a look.
> 
> > > There are three indexes defined on the table, each one is around 20 to 
> > > 25GB
> 
>  tp| character varying(2000)   | yes| tp   | extended |
> 
>852 | 00:09:56.131136 | IO  | DataFileRead  | explain
> analyze select distinct  empno  from emp where sname='test' and tp='EMP
> NAME 1'

The server is doing a scan of the large table.
The tp index matches a lot of rows (13e6) which probably aren't clustered, so
it elects to scan the 500GB table each time.

Looking at this in isolation, maybe it'd be enough to create an index on
tp,empno (and maybe drop the tp index).  CREATE INDEX CONCURRENTLY if you don't
want to disrupt other queries.

But this seems like something that should be solved in a better way, though; like
keeping a table with all the necessary "empno" values maintained with "INSERT ON
CONFLICT DO NOTHING", or with a trigger.
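
A sketch of both ideas (empno's data type isn't shown, so text is a guess, and the
sample value is made up):

-- the index suggestion above; sname could be added as a leading column too:
CREATE INDEX CONCURRENTLY ON emp (tp, empno);

-- or keep the needed empno values in a small side table, maintained at load time:
CREATE TABLE emp_names (empno text PRIMARY KEY);
INSERT INTO emp_names VALUES ('E123') ON CONFLICT DO NOTHING;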




Re: HIGH IO and Less CPU utilization

2022-03-29 Thread Justin Pryzby
On Wed, Mar 30, 2022 at 12:52:05AM +0530, Rambabu g wrote:
> > What indexes are defined on this table ?
> > How large are they ?
>
> There are three indexes defined on the table, each one is around 20 to 25GB
> and the indexes is create on

Did you mean to say something else after "on" ?

Show the definition of the indexes from psql \d

> postgres=# explain select distinct  empno  from emp where sname='test' and 
> tp='EMP NAME 1'

Is this the only query that's performing poorly ?
You should send explain (analyze,buffers) for the prolematic queries.

> > > Hypervisor vendor: KVM
> >
> > Are KSM or THP enabled on the hypervisor ?

> No, the Ec2 VM is delicate to postgres DB instances only.

Oh, so this is an EC2 and you cannot change the hypervisor itself.

> -bash-4.2$ tail /sys/kernel/mm/ksm/run 
> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag 
> /sys/kernel/mm/transparent_hugepage/enabled 
> /sys/kernel/mm/transparent_hugepage/defrag
...
> ==> /sys/kernel/mm/transparent_hugepage/defrag <==
> [always] madvise never

I doubt it will help, but you could try disabling these.
It's a quick experiment anyway.
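
For a non-persistent test (as root; a reboot undoes it):

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo 0 > /sys/kernel/mm/ksm/run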




Re: HIGH IO and Less CPU utilization

2022-03-29 Thread Justin Pryzby
Hi,

Thanks for providing all this info.

On Tue, Mar 29, 2022 at 11:34:18PM +0530, Rambabu g wrote:
> Hi All,
> 
> We have an issue with high load and IO Wait's but less cpu on postgres
> Database, The emp Table size is around 500GB, and the connections are very
> less.

What indexes are defined on this table ?
How large are they ?

> Red Hat Enterprise Linux Server release 7.9 (Maipo)
>  PostgreSQL 11.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 
> 20150623 (Red Hat 4.8.5-44), 64-bit
> 
> shared_buffers=12GB
> work_mem=128MB

> 14428 | 04:45:59.712892 | active  | INSERT INTO target (empno, name)
> SELECT empno, '' AS name FROM (select distinct  empno  from emp where 
> sname='test'
> and tp='EMP NAME 1' LIMIT 10) AS query   ;

Is this the only problem query, or just one example ??
Are your issues with loading data, querying data or both ?

> -bash-4.2$ iostat -x

It shows that you only have a few filesystems in use.
It's common to have WAL and temp_tablespaces on a separate FS.
That probably wouldn't help your performance at all, but it would help to tell
what's doing I/O.  Is there anything else running on the VM besides postgres ?

You can also check:
SELECT COUNT(1), wait_event FROM pg_stat_activity GROUP BY 2 ORDER BY 1 DESC;

And the pg_buffercache extension:
SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) AS all, COALESCE(c.relname,
b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON
b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC, 2 DESC
LIMIT 9;

> Hypervisor vendor: KVM

Are KSM or THP enabled on the hypervisor ?

tail /sys/kernel/mm/ksm/run 
/sys/kernel/mm/transparent_hugepage/khugepaged/defrag 
/sys/kernel/mm/transparent_hugepage/enabled 
/sys/kernel/mm/transparent_hugepage/defrag  
 

-- 
Justin




Re: Performance issue post upgrade on Version 13 - Incorrect Estimation Cost choosing Hash Aggregate-Nested Left Loop Join

2022-03-24 Thread Justin Pryzby
On Tue, Mar 22, 2022 at 12:57:10PM +, Prajna Shetty wrote:
> 1.  We have performed Vacuum/Analyze/Reindex post Upgrade.
> 2.  Tweaked work_mem so it does not spill to Disk. We can Disk Usage But 
> it is still using Hash Aggregate and came down from 5 minutes to 20 seconds. 
> (Expected ~5 seconds). Attached plan after modifying work_mem
> 3.  Disabled Seqcan/ nestedloop
> 4.  Tweaked random_page_cost/seq_page_cost
> 5.  Set default_statistics_target=1000 and then run 
> vacuum(analyze,verbose) on selected tables.
> 6.  We have also tested performance by increasing resources up to 4 vCPU 
> and 32 GB RAM.

Would you provide your current settings ?
https://wiki.postgresql.org/wiki/Server_Configuration




Re: High process memory consumption when running sort

2022-03-23 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 02:42:06PM +, Shai Shapira wrote:
> Hi,
> 
> When running our application, we noticed that some processes are taking a lot 
> of memory ( 10, 15, 20GB or so, of RSS ).
> It is also reproduced when running in psql.

Note that RSS can include shared_buffers read by that backend.
That's a linux behavior, not specific to postgres.  It's what Andres was
describing here:
https://www.postgresql.org/message-id/flat/20201003230149.mtd7fjsjwgii3...@alap3.anarazel.de
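
If you want to see how much of a given backend's RSS is shared memory, a kernel with
/proc/<pid>/smaps_rollup (4.14+) makes that easy, e.g. for the PID from your top output:

grep -E 'Rss|Pss|Shared|Private' /proc/15298/smaps_rollup

On older kernels the same fields are in /proc/<pid>/smaps, just split per mapping.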

You have effective_cache_size = 48GB, so this seems to be working as intended.
(ecc is expected to include data cached not only by postgres but by the OS page
cache, too).

> Memory consumption: ( of case 2, application table, using system_stats )

I'm not sure, but I guess this is just a postgres view of whatever the OS
shows.

> Using top:
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 15298 postgres  20   0   16.8g   1.1g   1.1g S   0.0  1.7   0:02.63 postgres

> PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 
> (Red Hat 4.8.5-44), 64-bit
> Linux illin7504 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 11 19:12:04 EDT 
> 2020 x86_64 x86_64 x86_64 GNU/Linux

> shared_buffers  | configuration file   | postmaster   
>  | 2097152| 8kB  | 1024
> effective_cache_size| configuration file   | user 
>  | 6291456| 8kB  | 524288
> work_mem| configuration file   | user 
>  | 20480  | kB   | 4096




Re: Using system tables directly takes many hours, using temp tables with no indexes takes a few seconds for geometry_columns view.

2022-03-23 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 09:44:09AM +, Lars Aksel Opsahl wrote:
> Why is temp tables with no indexes much faster system tables with indexes ?

I think the "temp table" way is accidentally faster due to having no
statistics, not because it has no indexes.  If you run ANALYZE, you may hit the
same issue (or, maybe you just need to VACUUM ANALYZE your system catalogs).
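
Something like this, as a superuser, is cheap to try (these are the catalogs behind
geometry_columns; add others as needed):

VACUUM (ANALYZE, VERBOSE) pg_catalog.pg_class;
VACUUM (ANALYZE, VERBOSE) pg_catalog.pg_attribute;
VACUUM (ANALYZE, VERBOSE) pg_catalog.pg_namespace;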

-- 
Justin




Re: Optimal configuration for server

2022-03-07 Thread Justin Pryzby
On Mon, Mar 07, 2022 at 08:51:24AM -0300, Luiz Felipph wrote:
> My current problem:
> 
> under heavyload, i'm getting "connection closed" on the application
> level(java-jdbc, jboss ds)

Could you check whether the server is crashing ?

If you run "ps -fu postgres", you can compare the start time ("STIME") of the
postmaster parent process with that of the persistent, auxiliary, child
processes like the checkpointer.  If there was a crash, the checkpointer will
have restarted more recently than the parent process.

The SQL version of that is like:
SELECT date_trunc('second', pg_postmaster_start_time() - backend_start) FROM 
pg_stat_activity ORDER BY 1 DESC LIMIT 1;

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-03-03 Thread Justin Pryzby
On Thu, Mar 03, 2022 at 01:33:08PM -0300, Ranier Vilela wrote:
> Sorry, but this is much more on the client side.

The client is reporting the problem, as is the server.

> Following the logs, it is understood that the client is dropping the
> connection.

The logs show that the client's connection *was* dropped.
And on the server, the same.

> So most likely the error could be from Pentaho or JDBC.
> 
> https://www.geeksforgeeks.org/java-net-socketexception-in-java-with-examples/
> " This *SocketException* occurs on the server-side when the client closed
> the socket connection before the response could be returned over the
> socket."
> 
> I suggest moving this thread to the Pentaho or JDBC support.

We don't know the source of the problem.  I still doubt it's in postgres, but I
don't think it's helpful to blame the client, just because the client reported
the problem.  If the server were to disconnect abruptly, I'd expect the client
to report that, too.

Laurent would just have to start the conversation over (and probably collect
the same diagnostic information anyway).  The client projects could blame
postgres with as much rationale as there is for us to blame the client.

Please don't add confusion here.  I made suggestions for how to collect more
information to better understand the source of the problem, and there's
probably not much else to say without that.

-- 
Justin




Re: OOM killer while pg_restore

2022-03-03 Thread Justin Pryzby
On Thu, Mar 03, 2022 at 09:59:03AM +0100, Marc Rechté wrote:
> Hello,
> 
> We have a pg_restore which fails due to RAM over-consumption of the
> corresponding PG backend, which ends-up with OOM killer.
> 
> The table has one PK, one index, and 3 FK constraints, active while restoring.

Send the schema for the table, index, and constraints (\d in psql).

What are the server settings ?
https://wiki.postgresql.org/wiki/Server_Configuration

What OS/version ?

> The dump contains over 200M rows for that table and is in custom format,
> which corresponds to 37 GB of total relation size in the original DB.
> 
> While importing, one can see the RSS + swap increasing linearly for the
> backend (executing the COPY)
> 
> On my machine (quite old PC), it failed after 16 hours, while the disk usage
> was reaching 26 GB and memory usage was 9.1g (RSS+swap)




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-03-01 Thread Justin Pryzby
On Tue, Mar 01, 2022 at 04:28:31PM +, l...@laurent-hasson.com wrote:
> Now, there is an additional component I think... Storage is on an array and I 
> am not getting a clear answer as to where it is  Is it possible that 
> something is happening at the storage layer? Could that be reported as a 
> network issue vs a storage issue for Postgres?

No.  If there were an error with storage, it'd be reported as a local error,
and the query would fail, rather than failing with client-server communication.

> Also, both machines are actually VMs. I forgot to mention that and not sure 
> if that's relevant.

Are they running on the same hypervisor ?  Is that hyperv ?
Lacking other good hypotheses, that does seem relevant.

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-28 Thread Justin Pryzby
On Mon, Feb 28, 2022 at 09:43:09PM +, l...@laurent-hasson.com wrote:
>On Wed, Feb 23, 2022 at 07:04:15PM -0600, Justin Pryzby wrote:
>>  > And the aforementioned network trace.  You could set a capture filter 
> on TCP
>>  > SYN|RST so it's not absurdly large.  From my notes, it might look 
> like this:
>>  > (tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)
>>  
>>  I'd also add '|| icmp'.  My hunch is that you'll see some ICMP (not 
> "ping")
>>  being sent by an intermediate gateway, resulting in the connection being
>>  reset.
> 
> I am so sorry but I do not understand what you are asking me to do. I am 
> unfamiliar with these commands. Is this a postgres configuration file? Is 
> this something I just do once or something I leave on to hopefully catch it 
> when the issue occurs? Is this something to do on the DB machine or the ETL 
> machine? FYI:

It's no problem.

I suggest that you run wireshark with a capture filter to try to show *why* the
connections are failing.  I think the capture filter might look like:

(icmp || (tcp[tcpflags] & (tcp-rst|tcp-syn|tcp-fin)!=0)) && host 10.64.17.211

With the "host" filtering for the IP address of the *remote* machine.

You could run that on whichever machine is more convenient and leave it running
for however long it takes for that error to happen.  You'll be able to save a
.pcap file for inspection.  I suppose it'll show either a TCP RST or an ICMP.
Whichever side sent that is where the problem is.  I still suspect the issue
isn't in postgres.

>   - My ETL machine is on 10.64.17.211
>   - My DB machine is on 10.64.17.210
>   - Both on Windows Server 2012 R2, x64

These network details make my theory unlikely.

They're on the same subnet with no intermediate gateways, and communicate
directly via a hub/switch/crossover cable.  If that's true, then both will have
each other's hardware address in ARP after pinging from one to the other.

-- 
Justin




Re: slow query to improve performace

2022-02-25 Thread Justin Pryzby
Please provide some more information, like your postgres version and settings.

Some relevant things are included here.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-25 Thread Justin Pryzby
On Wed, Feb 23, 2022 at 07:04:15PM -0600, Justin Pryzby wrote:
> And the aforementioned network trace.  You could set a capture filter on TCP
> SYN|RST so it's not absurdly large.  From my notes, it might look like this:
> (tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)

I'd also add '|| icmp'.  My hunch is that you'll see some ICMP (not "ping")
being sent by an intermediate gateway, resulting in the connection being reset.

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-24 Thread Justin Pryzby
On Thu, Feb 24, 2022 at 08:50:45AM -0300, Ranier Vilela wrote:
> I can't understand why you are still using 13.4?
> [1] There is a long discussion about the issue with 13.4, the project was
> made to fix a DLL bottleneck.
> 
> Why you not use 13.6?

That other problem (and its fix) were in the windows build environment, and not
an issue in some postgres version.  It's still a good idea to schedule an
update.

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-23 Thread Justin Pryzby
You originally mailed about an error on the client, and now you found
corresponding server logs, which suggests a veritable network issue.

Are the postgres clients and server on the same subnet ?  If not, what are the
intermediate routers ?  Is there any NAT happening ?  Do those devices have any
interesting logs that correspond with the server/client connection failures ?

Have you tried enabling TCP keepalives ?  This might help to convince a NAT
device not to forget about your connection.

https://www.postgresql.org/docs/current/runtime-config-connection.html
tcp_keepalives_idle=9
tcp_keepalives_interval=9
tcp_keepalives_count=0
tcp_user_timeout=0 -- You apparently have this set, but it cannot work on 
windows, so just generates noise.

On linux, you can check the keepalive counters in "netstat -not" to be sure
that it's enabled.  A similar switch hopefully exists for windows.

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2022-02-23 Thread Justin Pryzby
On Thu, Feb 24, 2022 at 12:47:42AM +, l...@laurent-hasson.com wrote:
> On Sat, Dec 04, 2021 at 05:32:10PM +, l...@laurent-hasson.com wrote:
> >  I have a data warehouse with a fairly complex ETL process that has
> >  been running for years now across PG 9.6, 11.2 and now 13.4 for the
> >  past couple of months. I have been getting the error "An I/O error
> >  occurred while sending to the backend" quite often under load in 13.4
> >  which I never used to get on 11.2. I have applied some tricks, 
> particularly
> >  with the socketTimeout JDBC configuration.

>  It'd be nice to see a network trace for this too.  Using tcpdump or
>  wireshark.  Preferably from the client side.
> 
> Hello Justin,
> 
> It has been ages! The issue has been happening a bit more often recently, as 
> much as once every 10 days or so. As a reminder, the set up is Postgres 13.4 
> on Windows Server with 16cores and 64GB memory. The scenario where this 
> occurs is an ETL tool called Pentaho Kettle (V7) connecting to the DB for 
> DataWarehouse workloads. The tool is Java-based and connects via JDBC using 
> postgresql-42.2.5.jar. There are no particular settings besides the 
> socketTimeout setting mentioned above.
> 
> The workload has some steps being lots of quick transactions for dimension 
> tables for example, but some fact table calculations, especially large 
> pivots, can make queries run for 40mn up to over an hour (a few of those).
> 
> I caught these in the logs at the time of a failure but unsure what to make 
> of that:
> 
> 2022-02-21 02:10:43.605 EST [1368] LOG:  unexpected EOF on client connection 
> with an open transaction
> 2022-02-21 02:10:43.605 EST [3304] LOG:  could not receive data from client: 
> An existing connection was forcibly closed by the remote host.
>   
> 2022-02-21 02:10:43.605 EST [3304] LOG:  unexpected EOF on client connection 
> with an open transaction
> 2022-02-21 02:31:38.808 EST [1704] LOG:  setsockopt(TCP_USER_TIMEOUT) not 
> supported

I suggest to enable CSV logging, which has many more columns of data.
Some of them might provide an insight - I'm not sure.
log_destination=csvlog (in addition to whatever else you have set).

And the aforementioned network trace.  You could set a capture filter on TCP
SYN|RST so it's not absurdly large.  From my notes, it might look like this:
(tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)

-- 
Justin




Re: Slow Running Queries in Azure PostgreSQL

2022-02-22 Thread Justin Pryzby
On Tue, Feb 22, 2022 at 02:11:58PM +, Kumar, Mukesh wrote:

>  ->  Hash Join  (cost=6484.69..43117.63 rows=1 width=198) (actual 
> time=155.508..820.705 rows=52841 loops=1)"
>Hash Cond: (((lms_doc_property_rights_assoc.doc_sid_c)::text = 
> (lms_doc_propright_status_assoc.doc_sid_c)::text) AND 
> ((lms_property_rights_base.property_sid_k)::text = 
> (lms_doc_propright_status_assoc.property_sid_c)::text))"

Your problem seems to start here.  It thinks it'll get one row but actually
gets 53k.  You can join those two tables on their own to understand the problem
better.  Is either or both halves of the AND estimated well ?

If both halves are individually estimated well, but estimated poorly together
with AND, then you have correlation.

Are either of those conditions redundant with the other ?  Half of the AND
might be unnecessary and could be removed.

-- 
Justin




Re: slow "select count(*) from information_schema.tables;" in some cases

2022-02-07 Thread Justin Pryzby
On Mon, Feb 07, 2022 at 04:56:35PM +, Lars Aksel Opsahl wrote:
> Sometimes simple sql's like this takes a very long time  "select count(*) 
> from information_schema.tables;"
> 
> Other sql's not including system tables may work ok but login also takes a 
> very long time.
> 
> The CPU load on the server is around 25%. There is no iowait.
> 
> This happens typically when we are running many functions in parallel 
> creating many temp tables and unlogged tables I think.
> 
> Here is a slow one:
> https://explain.depesz.com/s/tUt5
> 
> and here is fast one :
> https://explain.depesz.com/s/yYG4

The only difference is that this is sometimes many times slower.

 Finalize Aggregate  (cost=42021.15..42021.16 rows=1 width=8) (actual 
time=50602.755..117201.768 rows=1 loops=1)
   ->  Gather  (cost=42020.94..42021.15 rows=2 width=8) (actual 
time=130.527..117201.754 rows=3 loops=1)
 Workers Planned: 2
 Workers Launched: 2

> Here are my settings (the server has around 256 GB og memory) :

What version of postgres ?  What OS/version ?
https://wiki.postgresql.org/wiki/Slow_Query_Questions

Are there any server logs around that time ?
Or session logs for the slow query ?

Is it because the table creation is locking (rows of) various system catalogs ?
I'm not sure if it'd be a single, long delay that you could see easily with
log_lock_waits, or a large number of small delays, maybe depending on whether
your table creation is done within a transaction.

-- 
Justin




Re: Query is slower with a large proportion of NULLs in several columns

2021-12-21 Thread Justin Pryzby
On Tue, Dec 21, 2021 at 12:33:06AM -0500, Tom Lane wrote:
> So now we have a real mystery about what is happening on Lars'
> system.  Those numbers can't be right.

I realized Lars said it was x86_64/Linux, but I'm hoping to hear back with more
details:

What OS version?
Is it a VM of some type ?
How did you install postgres?  From a package or compiled from source?
grep -r HAVE_CLOCK_GETTIME /usr/pgsql-13/include
Send the exact command and output you used to run the query?
Why does your explain output have IO timing but not Buffers: hit/read ?

-- 
Justin




Re: Query is slower with a large proportion of NULLs in several columns

2021-12-20 Thread Justin Pryzby
On Mon, Dec 20, 2021 at 08:11:42PM -0800, Lars Bergeson wrote:
> ok, here are results after I did:
> set max_parallel_workers_per_gather = 0;
> 
> HashAggregate  (cost=1676432.13..1676432.16 rows=3 width=15) (actual 
> time=19908.343..19908.345 rows=5 loops=1)
>   I/O Timings: read=532369.898
> Execution Time: 19908.383 ms

> HashAggregate  (cost=1390580.70..1390580.72 rows=2 width=15) (actual 
> time=30369.758..30369.761 rows=5 loops=1)
>   I/O Timings: read=6440851.540
> Execution Time: 30369.796 ms

> Still taking 10X more I/O to read the smaller table. Very odd.

If I'm not wrong, it's even worse than that ?
It takes 20 or 30sec to run the query - but it says the associated I/O times
are ~500sec or ~6000sec ?

What architecture and OS/version are you running ?
How did you install postgres?  From a package or compiled from source ?

It might be interesting to know the output from something like this command,
depending on whether and where the headers like pg_config_x86_64.h are 
installed.

grep -r HAVE_CLOCK_GETTIME /usr/pgsql-13/include

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2021-12-04 Thread Justin Pryzby
On Sat, Dec 04, 2021 at 07:18:06PM +, l...@laurent-hasson.com wrote:
> It's a remote server, but all on a local network. Network performance is I am 
> sure not the issue. Also, the system is on Windows Server. What are you 
> expecting to see out of a tcpdump? I'll try to get PG logs on the failing 
> query.

I'd want to know if postgres sent anything to the client, like TCP RST, or if
the client decided on its own that there had been an error.

-- 
Justin




Re: An I/O error occurred while sending to the backend (PG 13.4)

2021-12-04 Thread Justin Pryzby
On Sat, Dec 04, 2021 at 05:32:10PM +, l...@laurent-hasson.com wrote:
> I have a data warehouse with a fairly complex ETL process that has been 
> running for years now across PG 9.6, 11.2 and now 13.4 for the past couple of 
> months. I have been getting the error "An I/O error occurred while sending to 
> the backend" quite often under load in 13.4 which I never used to get on 
> 11.2. I have applied some tricks, particularly with the socketTimeout JDBC 
> configuration.
> 
> So my first question is whether anyone has any idea why this is happening? My 
> hardware and general PG configuration have not changed between 11.2 and 13.4 
> and I NEVER experienced this on 11.2 in about 2y of production.
> 
> Second, I have one stored procedure that takes a very long time to run (40mn 
> more or less), so obviously, I'd need to set socketTimeout to something like 
> 1h in order to call it and not timeout. That doesn't seem reasonable?

Is the DB server local or remote (TCP/IP) to the client?

Could you collect the corresponding postgres query logs when this happens ?

It'd be nice to see a network trace for this too.  Using tcpdump or wireshark.
Preferably from the client side.

FWIW, I suspect the JDBC socketTimeout is a bad workaround.

-- 
Justin




Re: pg_dump backup verification

2021-11-25 Thread Justin Pryzby
On Thu, Nov 25, 2021 at 02:41:34PM +0530, Daulat wrote:
> Please suggest how I can ensure pg_dump backup has completed successfully ?
> I don't think there is any view like Oracle which helps with
> dba_datampump_jobs etc.

1) Check its exit status.  If it's nonzero, then surely there's a problem
(typically detailed by its output to stderr).

2) You can also run pg_restore -l to output a TOC for the backup.  From
experience, this can be a good secondary test.  You could add to your backup
script "pg_restore -l ./thebackup >/dev/null" to check that pg_restore itself
exits with a zero exit status.

3) If your backup job is a shell script, you should use "set -e", to be sure
that a command which fails causes the script to exit rather than plowing ahead
as if it had succeeded.  This is important for any shell script that's more
than 1 line long.

4) It's usually a good idea to write first to a "*.new" file, and then rename
it only if the pg_dump succeeds.  Avoid "clobbering" a pre-existing file (or
else you have no backup at all until the backup finishes, successfully).  Avoid
piping pg_dump to another command, since pipes only preserve the exit status of
the final command in the pipeline.

For example:

#! /bin/sh
set -e
f=/srv/otherfs/thebackup
rm -f "$f.new" # Remove a previous, failed backup, if any
pg_dump -Fc -d ourdatabase >"$f.new"
pg_restore -l "$f.new" >/dev/null
mv "$f.new" "$f"
exit 0 # In case the previous line is a conditional like "if" or "&&" or "||".

5) You can monitor the age of ./thebackup.

6) Log the output of the script; do not let its output get redirected to
/var/mail/postgres, or somewhere else nobody looks at.

7) It's nice to dump to a separate filesystem; not only because FS corruption on a
shared filesystem would affect both the live DB and its backup, but also because the
backups could overflow the FS, causing the main DB to fail queries or crash.

8) Keep a few backups rotated weekly and a few rotated monthly.  Even if it's
never needed to restore a 2 month old backup, it can be valuable to help
diagnose issues to see when some data changed.

9) Also save output from pg_dumpall -g, or else your backup will probably spew
out lots of errors, which are themselves important, but might also obscure
other, even more important problems.

10) Perhaps most importantly, test your backups.  Having backups is of little
use if you don't know how to restore them.  This should be a periodic
procedure, not something you do once to be able to say that you did.
Are you confident you can run using the restored DB ?  

-- 
Justin




Re: Need help identifying a periodic performance issue.

2021-11-24 Thread Justin Pryzby
On Wed, Nov 24, 2021 at 10:44:12PM +, Robert Creager wrote:
> I forgot, I had reloaded postgres, but had not re-started our app, so the 
> connections wouldn’t have that plan setting on them. Re-doing now.

Are you sure?  GUC changes should be applied for existing sessions, right ?

Would you send the logs surrounding the slow COPY ?
Specifically including the autovacuum logs.

> We are at it again.  I have a DELETE operation that’s taking 48 minutes so 
> far.

Before, you had slow COPY due to FKs.  Now you have a slow DELETE, which you
only alluded to before.

> So how do we avoid this query plan? Do we need to start doing explicit 
> analyzes after every delete?

If your DELETE is deleting the entire table, then I think you should VACUUM
anyway (or else the next inserts will bloat the table).

Or (preferably) use TRUNCATE instead, which will set relpages=0 and (one
supposes) avoid the bad plans.  But read the NOTE about non-mvcc behavior of
TRUNCATE, in case that matters to you.

But first, I believe Thomas was suggesting to put plan_cache_mode back to its
default, and (for testing purposes) try using issue DISCARD PLANS.

On Fri, Nov 19, 2021 at 10:08:02AM +1300, Thomas Munro wrote:
> Just to understand what's going on, it'd be interesting to know if the
> problem goes away if you *just* inject the DISCARD PLANS statement
> before running your COPYs, but if that doesn't help it'd also be
> interesting to know what happens if you ANALYZE each table after each
> COPY.  Are you running any explicit ANALYZE commands?  How long do
> your sessions/connections live for?

-- 
Justin




Re: performance of analytical query

2021-11-23 Thread Justin Pryzby
On Fri, Nov 12, 2021 at 09:12:38PM +0100, Jiří Fejfar wrote:
> * I know that PG is focused on OLTP rather then analytics, but we are happy
> with it at all and do not wish to use another engine for analytical
> queries... isn't somewhere some "PG analytical best practice" available?

It's a good question.  Here's some ideas:

I don't think we know what version you're using - that's important, and there's
other ideas here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions

You said that your query was slow "probably after VACUUM ANALYZE".
Is it really faster without stats ?  You can do this to see if there was really
a better plan "before":
| begin; DELETE FROM pg_statistic WHERE starelid='thetable'::regclass; explain 
analyze ...; rollback;

Try enable_nestloop=off for analytic queries;

Test whether jit=off helps you or hurts you (you said that it's already 
disabled);

You can do other things that can improve estimates, by sacrificing planning time
(which for an analytic query is a small component of the total query time, and
pays off at runtime if you can get a btter plan):
 - FKs can help with estimates since pg9.6;
 - CREATE STATISTICS;
 - ALTER SET STATISTICS or increase default_statistics_target;
 - increase from_collapse_limit and join_collapse_limit.  But I don't think it
   will help your current query plan.
 - partitioning data increases planning time, and (if done well) can allow
   improved execution plans;

You can REINDEX or maybe CLUSTER during "off hours" to optimize indexes/tables.

BRIN indexes (WITH autoanalyze) are very successful for us, here.

You can monitor your slow queries using auto_explain and/or pg_stat_statements.

You can reduce autovacuum_analyze_threshold to analyze more often.
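
As a concrete (if hand-wavy) example, a reporting connection here might start with
something like:

SET enable_nestloop = off;
SET jit = off;
SET from_collapse_limit = 20;
SET join_collapse_limit = 20;

and the larger fact tables get raised statistics targets, e.g.:

ALTER TABLE fact_table ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE fact_table;

(fact_table/customer_id are placeholders, and the values are starting points, not
recommendations.)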

I'd be interested to hear if others have more suggestions.

-- 
Justin




Re: Need help identifying a periodic performance issue.

2021-11-17 Thread Justin Pryzby
On Thu, Nov 18, 2021 at 04:39:42PM +1300, Thomas Munro wrote:
> On Thu, Nov 18, 2021 at 1:18 PM Robert Creager  
> wrote:
> > So, how do I go about capturing more information for the big brains (you 
> > guys) to help figure this out?  I have all our resources at mine (and hence 
> > your) disposal.
> 
> As a workaround, does it help if you issue DISCARD PLANS before your
> COPY jobs, or alternatively start with a fresh connection?  I'm
> guessing that something like this is happening.
> 
> -- set up the auto_explain extension to show the internal foreign key check 
> queries' plans
> load 'auto_explain';
> set auto_explain.log_nested_statements = true;
> set auto_explain.log_min_duration = 0;
> set auto_explain.log_analyze = true;

..and SET client_min_messages=debug;

> drop table if exists r, s cascade;
> create table r (i int primary key);
> create table s (i int references r(i));
> 
> -- collect stats showing r as empty
> analyze r;
> 
> -- execute RI query 6 times to lock the plan (inserts fail, log shows seq 
> scan)
> insert into s values (42);
> insert into s values (42);
> insert into s values (42);
> insert into s values (42);
> insert into s values (42);
> insert into s values (42);
> 
> insert into r select generate_series(1, 100);
> 
> -- once more, we still get a seq scan, which is by now a bad idea
> insert into s values (42);
> 
> discard plans;
>
> -- once more, now we get an index scan
> insert into s values (42);

It also seems to work if one does SET plan_cache_mode=force_custom_plan;

Robert might try that, either in postgresql.conf, or SET in the client that's
doing COPY.

Robert is using jdbc, which (as I recall) has this problem more often than
other clients.  But, in this case, I think JDBC isn't causing the problem.

-- 
Justin




Re: Need help identifying a periodic performance issue.

2021-11-17 Thread Justin Pryzby
On Wed, Nov 17, 2021 at 09:54:14PM +, Robert Creager wrote:
> We are able to move up to Postgres 13.5, in our ports tree, if that would 
> help.  We used pg_upgrade to get from 9.6 to 13.3, so that should work fine 
> going instead to 13.5.  We’re almost branching/releasing our code, so it’s 
> not a good time, but if it may help with this problem, we’ll deal with it.

To be clear, I have no specfic reason to believe it would help.
But it would be silly to chase down a problem that someone already fixed 10
months ago (the source of this problem, or something else that comes up).

In fact I suspect it won't help, and there's an issue with your schema, or
autovacuum, or postgres.

Note that since v10, the version scheme uses only two components, and 13.3 to
13.5 is a minor release, similar to 9.6.3 to 9.6.5.  So you don't need to use
pg_upgrade - just update the binaries.

https://www.postgresql.org/docs/13/release-13-5.html

-- 
Justin




Re: Need help identifying a periodic performance issue.

2021-11-17 Thread Justin Pryzby
On Wed, Nov 17, 2021 at 05:51:05PM +, Robert Creager wrote:
>   postgres`HeapTupleSatisfiesVisibility+0x42
>   postgres`heapgetpage+0x237
>   postgres`heapgettup_pagemode+0x5ad
>   postgres`heap_getnextslot+0x52
>   postgres`SeqNext+0x71
>   postgres`ExecScan+0xc9
>   postgres`ExecLockRows+0x7b
>   postgres`standard_ExecutorRun+0x10a
>   postgres`_SPI_execute_plan+0x524
>   postgres`SPI_execute_snapshot+0x116
>   postgres`ri_PerformCheck+0x29e
>   postgres`RI_FKey_check+0x5d3
>   postgres`RI_FKey_check_ins+0x21
>   postgres`ExecCallTriggerFunc+0x105
>   postgres`afterTriggerInvokeEvents+0x605
>   postgres`AfterTriggerEndQuery+0x7a
>   postgres`CopyFrom+0xaca
>   postgres`DoCopy+0x553
>   postgres`standard_ProcessUtility+0x5f9
>   postgres`ProcessUtility+0x28
>55

It shows that the process is running FK triggers.
Would you show \d for the table which is the destination of COPY, and for other
tables to which it has FK constraints.

Also, do you have any long-running transactions ?
In your first message, you showed no other queries except "idle" ones (not
idle-in-transaction) but I figured I'd ask anyway.

Does your COPY job run in a transaction block ?

You're running pg13.2, so it would be interesting to know if the problem exists
under 13.5.

-- 
Justin




Re: Need help identifying a periodic performance issue.

2021-11-15 Thread Justin Pryzby
On Tue, Nov 16, 2021 at 04:43:25AM +, Robert Creager wrote:
> We’re executing the following copy to fill a table with approximately 5k 
> records, then repeating for a total of 250k records.  Normally, this copy 
> executes < 1 second, with the entire set taking a couple of minutes. The 
> problem is not reproducible on command, but usually within a couple of hours 
> of starting some test runs.
> 
> COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) 
> FROM STDIN WITH DELIMITER AS ‘|’
> 
> But, occasionally we get into a huge performance bottleneck for about 2 
> hours, where these copy operations are taking 140 seconds or so
> 
> Nov 15 22:25:49 sm4u-34 postgres[5799]: [381-1] 
> db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 
> LOG:  duration: 145326.293 ms  statement: COPY ds3.blob (byte_offset, 
> checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS 
> '|'

> I’m logging statements with pgbadger monitoring the logs. There are no 
> apparent auto-vacuum’s running, nor any vacuums, nor anything at all really. 
> Other select queries around that time frame are executing normally.

What about checkpoints ?

Would you show the "^checkpoint starting" and "^checkpoint complete" logs
surrounding a slow COPY ?

> We’re coming from PostgreSQL 9.6 on FreeBSD 11 where we did not see this 
> problem, but have a major release upgrade happening.  I’m checking to see if 
> this machine was updated or was a fresh install.
>  PostgreSQL 13.2 on amd64-portbld-freebsd13.0, compiled by FreeBSD clang 
> version 11.0.1 (g...@github.com:llvm/llvm-project.git 
> llvmorg-11.0.1-0-g43ff75f2c3fe), 64-bit
> 
> Changes made to the settings in the postgresql.conf file
>  checkpoint_timeout  | 30min   | 
> configuration file
>  log_checkpoints | on  | 
> configuration file
>  log_lock_waits  | on  | 
> configuration file
...
>  shared_buffers  | 21679MB | 
> configuration file

> Operating system and version:
> FreeBSD sm4u-34 13.0-STABLE FreeBSD 13.0-STABLE #0: Mon Sep 13 10:11:57 MDT 
> 2021

> These are the system calls made over 30 seconds from Postgres during a 
> slowdown.
...
>   fsync27

-- 
Justin




Re: performance of analytical query

2021-11-12 Thread Justin Pryzby
On Fri, Nov 12, 2021 at 10:55:53AM -0700, Michael Lewis wrote:
> On Thu, Nov 11, 2021 at 7:42 PM Justin Pryzby  wrote:
> 
> > BTW, we disable nested loops for the our analytic report queries.  I have
> > never
> > been able to avoid pathological plans any other way.
> 
> Curious, do you see any problems from that? Are there certain nodes that
> really are best suited to a nested loop like a lateral subquery?

When I first disabled it years ago, I did it for the entire database, and it
caused issues with a more interactive, non-analytic query, on a non-partitioned
table.

So my second attempt was to disable nested loops only during report queries,
and I have not looked back.  For our report queries on partitioned tables, the
overhead of hashing a handful of rows is of no significance.  Any query that
finishes in 1sec would be exceptionally fast.

BTW, Jiří's inquiry caused me to look at the source of one of our historic
mis-estimates, and to realize that it's resolved in pg14:
https://www.postgresql.org/message-id/2022173102.GI17618%40telsasoft.com

I doubt that's enough to avoid catastrophic nested loop plans in every case
(especially CTEs on top of CTEs).

There was a discussion about discouraging nested loop plans that weren't
provably "safe" (due to returning at most one row, due to a unique index).
https://www.postgresql.org/message-id/CA%2BTgmoYtWXNpj6D92XxUfjT_YFmi2dWq1XXM9EY-CRcr2qmqbg%40mail.gmail.com

-- 
Justin




Re: performance of analytical query

2021-11-11 Thread Justin Pryzby
On Thu, Nov 11, 2021 at 08:20:57PM +0100, Jiří Fejfar wrote:
> Hi folks,
> 
> we have found that (probably after VACUUM ANALYZE) one analytical query
> starts to be slow on our production DB. Moreover, more or less the same
> plan is used on our testing data (how to restore our testing data is
> described at the end of this email), or better to say the same problem
> exists in both (production vs testing data) scenarios: nested loop scanning
> CTE several thousand times is used due to the bad estimates:
> https://explain.dalibo.com/plan/sER#plan/node/87 (query is included on
> dalibo).

> Do you have any idea how to get HASH JOINS in the CTE w_1p_data instead of
> NESTED LOOPs?
> * Add some statistics to not get bad estimates on "lower-level" CTEs?

Do you know why the estimates are bad ?

Index Scan using t_map_plot_cell__cell_gid__idx on cm_plot2cell_mapping 
cm_plot2cell_mapping (cost=0.29..18.59 rows=381 width=12) (actual 
time=0.015..2.373 rows=3,898 loops=1)
Index Cond: (cm_plot2cell_mapping.estimation_cell = 
f_a_cell.estimation_cell)
Buffers: shared hit=110

I don't know, but is the estimate for this portion of the plan improved by 
doing:
| ALTER TABLE f_a_cell ALTER estimation_cell SET STATISTICS 500; ANALYZE 
f_a_cell;

> * In a slightly more complicated function I used temporary tables to be
> able to narrow statistics [2] but I am afraid of system table bloating
> because of the huge amount of usage of this function on the production
> (hundred thousand of calls by day when data are to be analyzed).

I would try this for sure - I think hundreds of calls per day would be no
problem.  If you're concerned, you could add manual calls to do (for example)
VACUUM pg_attribute; after dropping the temp tables.

BTW, we disable nested loops for the our analytic report queries.  I have never
been able to avoid pathological plans any other way.




Re: Fwd: Query out of memory

2021-10-19 Thread Justin Pryzby
On Tue, Oct 19, 2021 at 11:28:46AM +0530, aditya desai wrote:
> I am running the below query. Table has 21 million records. I get an Out Of
> Memory error after a while.(from both pgadmin and psql). Can someone review

Is the out of memory error on the client side ?
Then you've simply returned more rows than the client can support.

In that case, you can run it with "explain analyze" to prove that the server
side can run the query.  That returns no data rows to the client, but shows the
number of rows which would normally be returned.

-- 
Justin




Re: Lock contention high

2021-10-13 Thread Justin Pryzby
On Tue, Oct 12, 2021 at 01:05:12PM +0530, Ashkil Dighin wrote:
> Hi,
> Lock contention observed high in PostgreSQLv13.3
> The source code compiled with GNC(GCCv11.x)
> PostgreSQL version: 13.3
> Operating system:   RHEL8.3
> Kernel name:4.18.0-305.10.2.el8_4.x86_64
> RAM Size:512GB
> SSD: 1TB
> The environment used IBM metal and test benchmark environment HammerDbv4.2
> Test case :TPC-C
> 
> Perf data for 24vu(TPC-C)
> 
> 
>   18.99%  postgres  postgres[.] LWLockAcquire
>  7.09%  postgres  postgres[.] _bt_compare
>  8.66%  postgres  postgres[.] LWLockRelease
...
> 1.Is there a way to tune the lock contention ?
> 2.Is any recommendations to tune/reduce the lock contention via postgres.conf

I think you'd want to find *which* LW locks are being waited on, to see if it's
something that can be easily tuned.

You can check pg_stat_activity, or maybe create a cronjob to record its content
for later analysis.
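
For example, something like this run repeatedly (or dumped from cron into a table)
shows which LWLocks the backends are actually waiting on:

SELECT wait_event, count(*)
FROM pg_stat_activity
WHERE wait_event_type = 'LWLock'
GROUP BY 1
ORDER BY 2 DESC;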

-- 
Justin




Re: Troubleshooting a long running delete statement

2021-10-06 Thread Justin Pryzby
On Wed, Oct 06, 2021 at 06:00:07PM +, Dirschel, Steve wrote:
> •   When I did an explain on the delete I could see it was full scanning 
> the table. I did a full scan of the table interactively in less than 1 second 
> so the long runtime was not due to the full tablescan.

> I started looking at table definitions (indexes, FK's, etc.) and comparing to 
> Oracle and noticed some indexes missing.  I then could see the table being 
> deleted from was a child table with a FK pointing to a parent table.  Finally 
> I was able to see that the parent table was missing an index on the FK column 
> so for every row being deleted from the child it was full scanning the 
> parent.  All makes sense after the fact but I'm looking for a more methodical 
> way to come to that conclusion by looking at database statistics.
> 
> Are there other statistics in Postgres I may have looked at to methodically 
> come to the conclusion that the problem was the missing index on the parent 
> FK column?

I think explain (analyze on) would've helped you.

If I understand your scenario, it'd look like this:

|postgres=# explain (analyze) delete from t;
| Delete on t  (cost=0.00..145.00 rows=1 width=6) (actual 
time=10.124..10.136 rows=0 loops=1)
|   ->  Seq Scan on t  (cost=0.00..145.00 rows=1 width=6) (actual 
time=0.141..2.578 rows=1 loops=1)
| Planning Time: 0.484 ms
| Trigger for constraint u_i_fkey: time=4075.123 calls=1
| Execution Time: 4087.764 ms

You can see the query plan used for the FK trigger with autoexplain.

postgres=*# SET auto_explain.log_min_duration='0s'; SET 
client_min_messages=debug; SET auto_explain.log_nested_statements=on;
postgres=*# explain (analyze) delete from t;
|...
|Query Text: DELETE FROM ONLY "public"."u" WHERE $1 OPERATOR(pg_catalog.=) "i"
|Delete on u  (cost=0.00..214.00 rows=1 width=6) (actual rows=0 loops=1)
|  Buffers: shared hit=90
|  ->  Seq Scan on u  (cost=0.00..214.00 rows=1 width=6) (actual rows=1 loops=1)
|Filter: ($1 = i)
|Rows Removed by Filter: 8616
|Buffers: shared hit=89
|...




Re: Problem with indices from 10 to 13

2021-09-28 Thread Justin Pryzby
On Wed, Sep 29, 2021 at 02:11:15AM +, Daniel Diniz wrote:
> How do i increase  the statistics target for h.nome_des?
> And why uploading the dump at 10 and at 13 is there this difference?

It's like ALTER TABLE h ALTER nome_des SET STATISTICS 2000; ANALYZE h;
https://www.postgresql.org/docs/current/sql-altertable.html

-- 
Justin




Re: Performance for initial copy when using pg_logical to upgrade Postgres

2021-09-24 Thread Justin Pryzby
On Fri, Sep 24, 2021 at 03:28:50PM +0100, Westwood, Giles wrote:
> At Orcid we're trying to upgrade our Postgres database (10 to 13) using
> pg_logical for no downtime. The problem we have is how long the initial
> copy is taking for the ~500GB database. If it takes say 20days to complete,
> will we need to have 20days of WAL files to start catching up when it's
> complete?

Did you see this thread and its suggestions to 1) set bulk load parameters;
and, 2) drop indexes and FKs ?

https://www.postgresql.org/message-id/flat/4a8efc4e-a264-457d-a8e7-ae324ed9a...@thebuild.com
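
The usual bulk-load knobs on the subscriber side look something like this (values are
only ballpark suggestions, and synchronous_commit=off trades durability for speed, so
revert them once the initial copy is done):

ALTER SYSTEM SET maintenance_work_mem = '2GB';
ALTER SYSTEM SET max_wal_size = '32GB';
ALTER SYSTEM SET checkpoint_timeout = '30min';
ALTER SYSTEM SET synchronous_commit = off;
SELECT pg_reload_conf();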

-- 
Justin




Re: Postgres chooses slow query plan from time to time

2021-09-13 Thread Justin Pryzby
On Mon, Sep 13, 2021 at 08:19:40AM -0600, Michael Lewis wrote:
> Autovacuum will only run for freezing, right? Insert only tables don't get
> autovacuumed/analyzed until PG13 if I remember right.

Tomas is talking about autovacuum running *analyze*, not vacuum.

It runs for analyze, except on partitioned tables and (empty) inheritance
parents.
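
So a partitioned or inheritance parent has to be analyzed manually, e.g.
(the table names here are placeholders):

ANALYZE parent_table;   -- on a partitioned table this collects stats for the whole tree
-- and to check what auto-analyze last did on the children:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname IN ('parent_table', 'child_table');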

-- 
Justin




Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4 (workarounds)

2021-08-29 Thread Justin Pryzby
On Mon, Aug 30, 2021 at 04:43:23AM +0200, Pavel Stehule wrote:
> On Mon, Aug 30, 2021 at 2:44 AM, l...@laurent-hasson.com wrote:
> > At this point, I am not sure how to proceed except to rethink that
> > toFloat() function and many other places where we use exceptions. We get
> > such dirty data that I need a "safe" way to convert a string to float
> > without throwing an exception. BTW, I tried other combinations in case
> > there may have been some weird interactions with the ::REAL conversion
> > operator, but nothing made any change. Could you recommend another approach
> > off the top of your head? I could use regexes for testing etc... Or maybe
> > there is another option like a no-throw conversion that's built in or in
> > some extension that you may know of? Like the "SAFE." Prefix in BigQuery.
> 
> CREATE OR REPLACE FUNCTION safe_to_double_precision(t text)
> RETURNS double precision AS $$
> BEGIN
>   IF $1 SIMILAR TO '[+-]?([0-9]*[.])?[0-9]+' THEN
> RETURN $1::double precision;
>   ELSE
> RETURN NULL;
>   END IF;
> END;
> $$ LANGUAGE plpgsql IMMUTABLE STRICT;

This tries to use a regex to determine whether something is a "Number" or not,
which has all the issues enumerated in painful detail by long answers on Stack
Overflow and on various wikis/blogs/forums.

Rather than trying to define Numbers with a regex, I'd try to screen out only the
most frequent exception-causing inputs and get 90% of the performance back.  I don't
know what your data looks like, but you might try something like this:

IF $1 IS NULL THEN RETURN $2;
ELSIF $1 ~ '^$' THEN RETURN $2;
ELSIF $1 ~ '[[:alpha:]]{2}' THEN RETURN $2;
ELSIF $1 !~ '[[:digit:]]' THEN RETURN $2;
END IF;

BEGIN
    RETURN $1::float;
EXCEPTION WHEN OTHERS THEN
    RETURN $2;
END;

You can check the stackoverflow page for ideas as to what kind of thing to
reject, but it may depend mostly on your data (what is the most common string?
The most common exceptional string?).

I think it's possible that could even be *faster* than the original, since it
avoids the exception block for values which are for sure going to cause an
exception anyway.  It might be that using alternation (|) is faster (if less
readable) than using a handful of IF branches.
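
As a sketch of the alternation variant (the same screens as above collapsed into
one test, keeping the exception block as a safety net for anything that slips past):

IF $1 IS NULL OR $1 ~ '^$|[[:alpha:]]{2}' OR $1 !~ '[[:digit:]]' THEN
    RETURN $2;
END IF;
BEGIN
    RETURN $1::float;
EXCEPTION WHEN OTHERS THEN
    RETURN $2;
END;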

-- 
Justin




Re: Using regexp from table has unpredictable poor performance

2021-08-25 Thread Justin Pryzby
On Wed, Aug 25, 2021 at 11:47:43AM -0500, Jack Christensen wrote:
> I have items that need to be categorized by user defined matching rules.
> Trusted users can create rules that include regular expressions. I've
> reduced the problem to this example.

> I use the following query to find matches:
> 
> select r.id, i.id
> from items i
>   join matching_rules r on i.name ~ r.name_matches;
> 
> When there are few rules the query runs quickly. But as the number of rules
> increases the runtime often increases at a greater than linear rate.

Maybe it's because the REs are cached by RE_compile_and_cache(), but if you
loop over the REs in the inner loop, then the caching is ineffective.

Maybe you can force it to join with REs on the outer loop by writing it as:
| rules LEFT JOIN items WHERE rules.id IS NOT NULL,
..to improve performance, or at least test that theory.
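
Spelled out, that rewrite might look like this (same tables and columns as your
original query):

SELECT r.id, i.id
FROM matching_rules r
LEFT JOIN items i ON i.name ~ r.name_matches
WHERE r.id IS NOT NULL;
-- the LEFT JOIN is meant to keep matching_rules on the outer side of the nested
-- loop, so each rule's compiled RE can be reused across the scan of items;
-- note rules that match nothing come back with a NULL item id.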

-- 
Justin




Re: Postgres using the wrong index index

2021-08-23 Thread Justin Pryzby
On Mon, Aug 23, 2021 at 08:53:15PM -0400, Matt Dupree wrote:
> Is it possible that the row estimate is off because of a column other than
> time?

I would test this by writing the simplest query that reproduces the
mis-estimate.
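
For example, something as simple as this (the bounds are placeholders for the ones
in your plan), comparing the scan node's estimated rows= against the actual rows:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM other_events_1004175222
WHERE "time" >= <lower_bound> AND "time" <= <upper_bound>;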

> I looked at the # of events in that time period and 1.8 million is
> actually a good estimate. What about the
> ((strpos(other_events_1004175222.hierarchy, '#close_onborading;'::text) <>
> 0) condition in the filter? It makes sense that Postgres wouldn't have a
> way to estimate how selective this condition is.

The issue I see is here.  I don't know where else I'd start but to understand
this.

| Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on 
public.other_events_1004175222 (cost=0.28..1,648,877.92 ROWS=1,858,891 
width=32) (actual time=1.008..15.245 ROWS=23 loops=1)
|Output: other_events_1004175222.user_id, other_events_1004175222."time", 
other_events_1004175222.session_id
|Index Cond: ((other_events_1004175222."time" >= '162477720'::bigint) 
AND (other_events_1004175222."time" <= '162736920'::bigint))
|Buffers: shared read=25

This has no "filter" condition, it's a "scan" node with bad over-estimate.
Note that this is due to the table's column stats, not any index's stats, so
every plan is affected. even though some happen to work well.  The consequences
of over-estimates are not as terrible as for under-estimates, but it's bad to
start with inputs that are off by 10^5.

-- 
Justin




Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4

2021-08-22 Thread Justin Pryzby
On Sun, Aug 22, 2021 at 08:44:34PM -0300, Ranier Vilela wrote:
> > If there is any way I can help further... I am definitely not able to do a
> > dev environment and local build, but if we have a windows developer
> > reproducing the issue between 11 and 12, then that should help. If someone
> > makes a debug build available to me, I can provide additional help based on
> > that.
>
> Please, download from this link (Google Drive):
> 
> https://drive.google.com/file/d/13kPbNmk54lR6t-lwcwi-63UdM55sA27t/view?usp=sharing

Laurent gave a recipe to reproduce the problem, and you seemed to be able to
reproduce it, so I think Laurent's part is done.  The burden now lies with
postgres developers to isolate the issue, and Andrew said he would bisect to
look for the culprit commit.

-- 
Justin




Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4

2021-08-22 Thread Justin Pryzby
On Sun, Aug 22, 2021 at 10:50:47AM -0300, Ranier Vilela wrote:
> > Tried to check this with Very Sleepy at Windows 10 (bare metal).
> > Not sure it can help if someone can guide how to test this better?

> explain (analyze,buffers,COSTS,TIMING) select MAX(toFloat(a, null)) as "a" 
> from sampletest;

Your 100sec result *seems* to reproduce the problem, but it'd be more clear if
you showed the results of both queries (toFloat(a) vs toFloat(b)).
Laurent's queries took 800sec vs 2sec.

> postgres.png (print screen from Very Sleepy)
> postgres.csv

This looks useful, thanks.  It seems like maybe win64 builds are very slow
running this:

exec_stmt_block() /
BeginInternalSubTransaction() /
AbortSubTransaction() /
reschedule_timeouts() /
schedule_alarm() / 
setitimer() /
pg_timer_thread() /
WaitForSingleObjectEx () 

We should confirm whether there's a dramatic regression caused by postgres
source code (and not by compilation environment or windows version changes).
Test if there's a dramatic difference between v11 and v12, or v12 and v13.
To be clear, the ~4x difference in v11 between Laurent's "exceptional" and
"nonexceptional" cases is expected.  But the 400x difference in v13 is not.

If it's due to a change in postgres source code, we should find what commit
caused the regression.

First, check if v12 is affected.  Right now, we know that v11.2 is ok and v13.4
is not ok.  Then (unless someone has a hunch where to look), you could use git
bisect to find the culprit commit.

Git log shows 85 commits affecting those files across the 2 branches - once we
determine whether v12 is affected, that alone eliminates a significant fraction
of the commits to be checked.

git log --oneline --cherry-pick origin/REL_11_STABLE...origin/REL_13_STABLE 
src/backend/access/transam/xact.c src/backend/port/win32/timer.c 
src/backend/utils/misc/timeout.c src/pl/plpgsql/src/pl_exec.c

-- 
Justin




Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4

2021-08-21 Thread Justin Pryzby
Could you send SELECT * FROM pg_config()
and try to find the CPU model ?

I think it's possible the hypervisor is trapping and emulating unhandled CPU
instructions.

Actually, it would be interesting to see if the performance differs between
11.2 and 11.13.  It's possible that EDB compiled 11.13 on a newer CPU (or with a
newer compiler) than 11.2 was.

If you test that, it should be on a separate VM, unless the existing data dir
can be restored from backup.  Once you've started a cluster with updated
binaries, you should avoid downgrading the binaries.




Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4

2021-08-21 Thread Justin Pryzby
On Sat, Aug 21, 2021 at 02:19:50PM -0500, Justin Pryzby wrote:
> As I recall, you're running postgres under a windows VM - I'm not sure if
> that's relevant.

I tried under a couple hyperv VMs but could not reproduce the issue (only an
~8x difference "with exceptions").

Which hypervisor are you using ?

I don't know if any of it matters, but would you also send:

SELECT version();
SELECT * FROM pg_config();

And maybe the CPU info ?

-- 
Justin




Re: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4

2021-08-21 Thread Justin Pryzby
On Sat, Aug 21, 2021 at 02:17:26PM -0400, Tom Lane wrote:
> "l...@laurent-hasson.com"  writes:
> > So you mean that on average, the 4x overhead of exceptions is around what 
> > you'd expect?
> 
> Doesn't surprise me any, no.  Exception recovery has to clean up after
> a wide variety of possible errors, with only minimal assumptions about
> what the system state had been.  So it's expensive.  More to the point,
> the overhead's been broadly the same for quite some time.
> 
> > As for results in general, yes, your numbers look pretty uniform across 
> > versions. On my end, comparing V11.2 vs V13.4 shows a much different 
> > picture!
> 
> I'm baffled why that should be so.  I do not think any of the extensions
> you mention add any exception-recovery overhead, especially not in
> sessions that haven't used them.

Laurent, did you install binaries for v13.4 or compile it ?

What about these ?

SHOW shared_preload_libraries;
SHOW session_preload_libraries;
SHOW local_preload_libraries;

Would you try to reproduce the issue with a fresh database:
CREATE DATABASE udftest; ...

Or a fresh instance created with initdb.

As I recall, you're running postgres under a windows VM - I'm not sure if
that's relevant.

-- 
Justin




Re: Postgres using the wrong index index

2021-08-17 Thread Justin Pryzby
On Mon, Aug 16, 2021 at 11:22:44AM -0400, Matt Dupree wrote:
> > Is either half of the AND estimated correctly?  If you do a query
> > with only ">=", and a query with only "<=", do either of them give an
> > accurate rowcount estimate ?
> 
> Dropping >= results in the correct index being used. Dropping <= doesn't
> have this effect.

This doesn't answer the question though: are the rowcount estimates accurate (say,
within 10%)?

It sounds like interpolating the histogram is giving a poor result, at least
over that range of values.  It'd be interesting to see the entire histogram.
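
For example:

SELECT tablename, inherited, histogram_bounds
FROM pg_stats
WHERE tablename = 'other_events_1004175222' AND attname = 'time';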

You might try increasing (or decreasing) the stats target for that column, and
re-analyzing.
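
Something like this (2000 is just an example; the default target is 100):

ALTER TABLE other_events_1004175222 ALTER COLUMN "time" SET STATISTICS 2000;
ANALYZE other_events_1004175222;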

Your histogram bounds are for ~38 months of data, and your query is for the
previous month (July).

$ date -d @1530186399
Thu Jun 28 06:46:39 CDT 2018
$ date -d @1629125609
Mon Aug 16 09:53:29 CDT 2021

$ date -d @1627369200
Tue Jul 27 02:00:00 CDT 2021
$ date -d @1624777200
Sun Jun 27 02:00:00 CDT 2021

The timestamp column has ndistinct near -1, similar to a continuous
distribution, so I'm not sure why the estimate would be so bad.

-- 
Justin




Re: Postgres using the wrong index index

2021-08-12 Thread Justin Pryzby
On Thu, Aug 12, 2021 at 09:38:45AM -0400, Matt Dupree wrote:
> > The rowcount estimate for the time column is bad for all these plans - do 
> > you
> > know why ?  You're using inheritence - have you analyzed the parent tables 
> > recently ?
> 
> Yes. I used ANALYZE before posting, as it's one of the "things to try"
> listed in the slow queries wiki. I even ran the queries immediately after
> analyzing. No difference. Can you say more about why the bad row estimate
> would cause Postgres to use the bigger index? I would expect Postgres to
> use the smaller index if it's over-estimating how many rows will be
> returned.

The overestimate is in the table's "time" column (not index) and applies to all
the plans.  Is either half of the AND estimated correctly?  If you do a query
with only ">=", and a query with only "<=", do either of them give an accurate
rowcount estimate ?

|Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on 
public.other_events_1004175222 (cost=0.28..1,648,877.92 rows=1,858,891 
width=32) (actual time=1.008..15.245 rows=23 loops=1) 
|Index Cond: ((other_events_1004175222."time" >= '162477720'::bigint) AND 
(other_events_1004175222."time" <= '162736920'::bigint))

It seems like postgres expects the scan to return a large number of matching
rows, so it tries to use the more selective index which includes the "type"
column.  But "type" is not very selective either (it has only 4 distinct
values), and "time" is not the first column, so it reads a large fraction of
the table, slowly.

Could you check pg_stat_all_tables and be sure the last_analyzed is recent for
both parent and child tables ?

Could you send the histogram bounds for "time" ?
SELECT tablename, attname, inherited, array_length(histogram_bounds,1), 
(histogram_bounds::text::text[])[1], 
(histogram_bounds::text::text[])[array_length(histogram_bounds,1)]
FROM pg_stats ... ;

-- 
Justin




Re: Postgres using the wrong index index

2021-08-11 Thread Justin Pryzby
The rowcount estimate for the time column is bad for all these plans - do you
know why ?  You're using inheritance - have you analyzed the parent tables
recently ?

| Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on 
public.other_events_1004175222 (cost=0.28..1,648,877.92 rows=1,858,891 
width=32) (actual time=1.008..15.245 rows=23 loops=1)
|Index Cond: ((other_events_1004175222."time" >= '162477720'::bigint) 
AND (other_events_1004175222."time" <= '162736920'::bigint))

-- 
Justin




Re: Postgres using the wrong index index

2021-08-11 Thread Justin Pryzby
On Tue, Aug 10, 2021 at 12:47:20PM -0400, Matt Dupree wrote:
> Here's the plan: https://explain.depesz.com/s/uNGg
> 
> Note that the index being used is

Could you show the plan if you force use of the intended index ?
For example by doing begin; DROP INDEX indexbeingused; explain thequery; 
rollback;
Or: begin; UPDATE pg_index SET indisvalid=false WHERE 
indexrelid='indexbeingused'::regclass; explain thequery; rollback;

Could you show the table statistics for the time, user_id, and type columns on
all 4 tables ?
| SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, 
attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) 
n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE 
attname='...' AND tablename='...' ORDER BY 1 DESC; 

It might be interesting to see both query plans when index scans are disabled
and bitmap scans are used instead (this might be as simple as begin; SET LOCAL 
enable_indexscan=off ...; rollback;);
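
Spelled out, that sketch would be something like this, where <the query> stands for
your original query:

BEGIN;
SET LOCAL enable_indexscan = off;
EXPLAIN (ANALYZE, BUFFERS) <the query>;
ROLLBACK;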

> Also note that these child tables have 100s of partial indexes. You
> can find history on why we have things set up this way here
> .

I have read it before :)

> SELECT relname, relpages, reltuples, relallvisible, pg_table_size(oid)
> FROM pg_class WHERE relname = 'other_events_1004175222';

Could you also show the table stats for the two indexes ?

One problem is that the rowcount estimate is badly off:
| Index Scan using other_events_1004175222_pim_core_custom_2_8e65d072fbdd_idx 
on public.other_events_1004175222 (cost=0.57..1,213,327.64 rows=1,854,125 
width=32) (actual time=450.588..29,057.269 rows=23 loops=1) 

To my eyes, this looks like a typo ; it's used in the index predicate as well
as the query, but maybe it's still relevant ?
| #close_onborading

-- 
Justin




Re: Slow query because lexeme index not used

2021-08-07 Thread Justin Pryzby
On Sat, Aug 07, 2021 at 07:35:25PM +, Alex wrote:
> Table "product" has a GIN index on "lexeme" column (tsvector) that is not 
> used.
> 
> Query that doesn't use lexeme idx:  https://explain.dalibo.com/plan/BlB#plan, 
> ~8s, ~60.000 blocks needed
> 
> Query forced to use lexeme idx: https://explain.dalibo.com/plan/i52, ~800ms 
> (10x less), ~15.000 blocks needed (x4 less)

Could you show the table stats for product.id ?  In particular its
"correlation".

I guess the correlation is ~1, and the 10,659 index scans on product.id are
considered to be cheaper than scannning the lexeme index - since there are no
correlation stats for tsvector.

How large is shared_buffers ?

Does the query plan improve if you increase work_mem ?

Maybe you could encourage scanning in order of product_property.product.
You could CLUSTER product_property_default on an index on "product" and then
ANALYZE.  Or you could write the query with a temp table:

CREATE TEMP TABLE product_ids AS
SELECT product
FROM product_property
WHERE "meaning" = 'B' AND "first" = 1.7179869184E10
GROUP BY 1 -- or DISTINCT, because the table is only used for EXISTS
ORDER BY 1; -- to scan product in order of id
ANALYZE product_ids;

The index scans on product.id should be faster when you use
EXISTS(SELECT 1 FROM product_ids ...), even though it didn't use the lexeme 
index.
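
For the CLUSTER route mentioned above, a minimal sketch (the index name is made up):

CREATE INDEX product_property_default_product_idx
    ON product_property_default (product);
CLUSTER product_property_default USING product_property_default_product_idx;
ANALYZE product_property_default;

Note CLUSTER takes an exclusive lock and rewrites the table, so it's best done in a
maintenance window.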

Maybe it would help to create stats on "first" and "meaning"; the rowcount is
underestimated by 3x, which means it did several times more index scans into
"product" than planned.
| Bitmap Heap Scan on product_property_default product_property_default 
(cost=2,748.6..8,823.4 rows=6,318 width=4) (actual time=43.945..211.621 
rows=21,061 loops=1) 

CREATE STATISTICS first_meaning ON first,meaning FROM product_property;
ANALYZE product_property;

> Table metdata:
>           relname          | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
> ---------------------------+----------+-----------+---------------+---------+----------+----------------+------------+---------------
>  product_property_default  |     8992 |    622969 |          8992 | r       |       16 | f              |            |      73719808
>  product                   |    49686 |    413840 |         49686 | r       |       14 | f              |            |     493314048
>
> Table stats:
>   frac_mcv   |        tablename         | attname | inherited | null_frac | n_distinct  | n_mcv | n_hist | correlation
> -------------+--------------------------+---------+-----------+-----------+-------------+-------+--------+-------------
>              | product                  | lexeme  | f         |         0 |          -1 |       |        |
>   0.99773335 | product_property_default | meaning | f         |         0 |          63 |    39 |     24 |  0.19444875
>   0.6416333  | product_property_default | first   | f         |         0 |        2193 |   100 |    101 | -0.09763639
>   0.0002334  | product_property_default | product | f         |         0 | -0.15221785 |     1 |    101 |  0.08643274
> 
> 
> Using windows docker with wsl2. Both cases are run with cold cache. All 
> database memory is limited to 1GB by using .wslconfig file with memory=1GB, 
> also the docker container is limited to 1GB. 
> My requirement is to optimize disk access with this limited memory



