Well, I can give a measurement on my home PC, a Linux box running Ubuntu
17.10 with a Samsung 960 EVO 512GB NVMe disk containing Postgres 10. Using
your pgbench init I got for example:
pgbench -c 10 -t 1 test
starting vacuum...end.
transaction type:
scaling factor: 100
query mode: simple
numb
You might be comparing apples and pears...
Your Oracle is running on-prem while Postgres is running on Azure. Azure
does not really have disks; it seems to have just a bunch of old people
writing the data on paper - I/O on Azure is ridiculously slow. What
disks/hardware does the on-prem Oracle have?
Hi list,
I have a misbehaving query which uses all available disk space and then
terminates with a "cannot write block" error. To prevent other processes
from running into trouble I've set the following:
temp_file_limit = 100GB
The query does parallelize and uses one parallel worker while execut
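For readers who want to reproduce this, a minimal sketch of applying the same
limit without editing postgresql.conf (assumes superuser access; the 100GB
value is simply the one quoted above):

    ALTER SYSTEM SET temp_file_limit = '100GB';  -- cap on temp file space, enforced per process
    SELECT pg_reload_conf();
    SHOW temp_file_limit;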
Hi Justin, thanks for your help!
Simple things first:
- I am running a single query on a developer machine. Nothing else uses the
database at that point.
- The database runs on a disk that has 473GB in use and 1.3T still free. I
am watching the increase in size used (watch df -hl /d2; a server-side
alternative is sketched after this list).
- If I remo
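As referenced in the list above, temp usage can also be watched from SQL via
the per-database statistics view. Note that these counters are cumulative and
only advance when temp files are removed, so for live growth the df/du
approach is still needed. A sketch:

    SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS total_temp
    FROM pg_stat_database
    WHERE datname = current_database();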
Hi Tom and Thomas, thanks for your help.
@Tom:
If it really is per-process then I would have expected it to die after
200GB was used?
As far as "valid bug" is concerned: I had hoped this would be per session,
as this at least delivers a reasonable and usable limit; it is easy to
control the number
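For context, capping the number of parallel workers is a one-line session
setting; with a per-process limit, worst-case temp usage is then roughly
(workers + 1) * temp_file_limit. A sketch (the values are only an
illustration):

    SET max_parallel_workers_per_gather = 0;  -- run the query without parallel workers
    -- or keep some parallelism but bound it:
    SET max_parallel_workers_per_gather = 1;  -- leader plus at most one worker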
Hi Ranier, thanks for your help.
I do not have more disks lying around, and I fear that if it does not
complete with 1.3TB of disk space it might not be that likely that adding
750GB would work...
Postgres version: the original (prd) issue was on 10.x. I also tested it on
14.x with the same issue.
Hehehe, that is not the worst plan ;) I did that once to debug a deadlock
in the JDBC driver when talking with Postgres, but it's not an adventure
I'd like to repeat right now ;)
>
@justin:
Ran the query again. Top shows the following processes:
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 650830 postgres  20   0 7503,2m   2,6g   2,6g R 100,0   4,2  12:46.34 postgres: jal datavault_317_prd [local] EXPLAIN
 666141 postgres  20   0 7486,3m
at we know
it can do, reliably.
On Mon, Dec 19, 2022 at 6:46 PM Justin Pryzby wrote:
> On Mon, Dec 19, 2022 at 06:27:57PM +0100, Frits Jalvingh wrote:
> > I have listed the files during that run,
>
> > 213M -rw------- 1 postgres postgres 213M dec 19 17:46 i100of128.p0.0
> >
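For what it's worth, on Postgres 12 and later the same temporary files can
also be listed from SQL, without shell access to the data directory; a
sketch:

    SELECT name, pg_size_pretty(size) AS pretty_size, modification
    FROM pg_ls_tmpdir()
    ORDER BY size DESC;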
@ranier
These files ONLY exist during the query. They get deleted as soon as the
query terminates, by Postgres itself. Once the query terminates pgsql_tmp
is completely empty. Considering what Thomas said (and the actual
occurrence of the files he mentioned) this does seem to be the more likely
cau
@justin
I tried the create statistics variant and that definitely improves the
estimate, and with that one of the "bad" cases (the one with the 82 minute
plan) now creates a good plan using only a few seconds.
That is a worthwhile path to follow. A bit hard to do, because those
conditions can be a
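For readers who want to try the same approach, a sketch of the
extended-statistics statement being discussed; the table and column names
here are hypothetical, the real ones are not shown in the thread:

    CREATE STATISTICS org_date_stx (dependencies, ndistinct)
        ON organisation_id, date_key FROM some_fact_table;
    ANALYZE some_fact_table;  -- the statistics are only populated on the next analyze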
@Thomas
Thanks for helping identify the issue. I think it would be nice to clean
up those obsoleted files during the run, because running out of disk is
really not a good thing to have ;) Of course the bad estimates leading to
the resize are the real issue, but this at least makes it less bad.
On Tue, Dec 20, 2022 at 10:11 PM Justin Pryzby wrote:
> On Mon, Dec 19, 2022 at 09:10:27PM +0100, Frits Jalvingh wrote:
> > @justin
> >
> > I tried the create statistics variant and that definitely improves the
> > estimate, and with that one of the "bad" cases (the o
Hi list,
We have an application that generates SQL statements that are then executed
on a PostgreSQL database. The statements are always "bulk"-type statements:
they always return a relatively large amount of data, and have only a few
not very selective filter expressions. They do contain a terrib
I actually wondered whether that would be a cause of the issue, because as
far as costs are concerned that second nested loop only _increases_ the
cost by 2 times...
Regards,
Frits
On Tue, Nov 17, 2020 at 3:20 PM Justin Pryzby wrote:
> On Tue, Nov 17, 2020 at 02:47:55PM +0100, Frits Jalvingh wro
Ah, sorry, I forgot. I set "hash_mem_multiplier = 2", and after that to 20.
It had no effect on the nested loops.
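For reference, the session-level form of that experiment (hash_mem_multiplier
exists as of Postgres 13; the values 2 and 20 are the ones mentioned above):

    SET hash_mem_multiplier = 2;   -- hash-based nodes may use up to work_mem * 2
    SET hash_mem_multiplier = 20;  -- second attempt: work_mem * 20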
On Tue, Nov 17, 2020 at 4:58 PM Frits Jalvingh wrote:
> Hi Justin, thanks for your help!
> I have attached both plans, both made with set enable_nestloop
nested loop on top, but the total cost is now:
GroupAggregate (cost=2005652.88..2005652.90 rows=370 width=68)
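For readers following along, "set enable_nestloop" above refers to the
per-session planner switch; a sketch with a hypothetical query standing in
for the real one:

    SET enable_nestloop = off;  -- discourage nested-loop joins for this session
    EXPLAIN
    SELECT d.day, count(*)
    FROM time_dim d
    JOIN fact_table f ON f.day_id = d.day_id
    GROUP BY d.day;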
On Tue, Nov 17, 2020 at 5:08 PM Frits Jalvingh wrote:
> Ah, sorry, I forgot. I set "hash_mem_multiplier = 2", and after that to
> 20. It had no effect on
using SQL and doing them by hand in code.. That would prove to be
disastrous for performance as I'd have to read all those datasets
completely... Do you have an idea on how to do that better?
Regards,
Frits
On Tue, Nov 17, 2020 at 5:21 PM Tom Lane wrote:
> Frits Jalvingh writes:
> &
On Tue, Nov 17, 2020 at 5:42 PM Frits Jalvingh wrote:
> Hello Tom, thanks for your help!
>
> I understand that the "time" table cross join needs a nested loop. Indeed
> that nested loop is present in all plans generated.
> But it is the _second_ (topmost) nested loop that is the issue. Once
It completely depends on a lot of factors of course, so these numbers are
meaningless.
It depends at the very least on:
* The hardware (CPU, disk type + disk connection)
* The size of the records read/written
* The presence of indices and constraints.
So, adding some other meaningless numbers to a