Re: [PERFORM] Hash Anti Join performance degradation

panam Thu, 26 May 2011 05:34:21 -0700

Hi there,


Kevin Grittner wrote:
> 
>> Is there a way to determine the values actually used?
> The pg_settings view.  Try the query shown here:
> http://wiki.postgresql.org/wiki/Server_Configuration
> 
Thanks Kevin, very usful. Here is the output:

"version";"PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit"
"bytea_output";"escape"
"client_encoding";"UNICODE"
"effective_cache_size";"4GB"
"lc_collate";"German_Germany.1252"
"lc_ctype";"German_Germany.1252"
"listen_addresses";"*"
"log_destination";"stderr"
"log_line_prefix";"%t "
"logging_collector";"on"
"max_connections";"100"
"max_stack_depth";"2MB"
"port";"5432"
"server_encoding";"UTF8"
"shared_buffers";"1GB"
"temp_buffers";"4096"
"TimeZone";"CET"
"work_mem";"1GB"


Craig Ringer wrote:
> 
> On 05/26/2011 12:42 AM, panam wrote:
> It's a bit beyond me, but I suspect that it'd be best if you could hang 
> onto the dump file in case someone has the time and enthusiasm to 
> investigate it. I take it you can't distribute the dump file, even 
> privately?
> 
Fortunately, I managed to reduce it to the absolute minimum (i.e. only
meaningless ids), and the issue is still observable.
You can download it from here:
http://www.zumodrive.com/file/460997770?key=cIdeODVlNz

Some things to try:
* tune your psql settings if you want
* reindex, vaccum analzye if you want

"Patholgical" query:

select
        b.id,
        (SELECT
                m1.id 
        FROM
                message m1 
        LEFT JOIN
                message m2 
                        ON (
                                m1.box_id = m2.box_id 
                                AND m1.id < m2.id
                        ) 
        WHERE
                m2.id IS NULL 
                AND m1.box_id = b.id)
from
        box b

=> takes almost "forever" (~600 seconds on my system)

Try

delete from message where id > 2550000;

=> deletes 78404 rows
Do the "pathological" query again
=> speed is back (~4 seconds on my system)

Replay the dump
Try

delete from message where id < 1000000;

=> deletes 835844 (10 times than before) rows. Maybe you can delete many
more, I haven't tested this systematically.
Do the "pathological" query again
=> takes almost "forever" (didn't wait...)

Replay the dump
Cluster:

cluster message_pkey on message;

Do the "pathological" query again
=> speed is back (~3 seconds on my system)

Any third party confirmation?

Thanks
panam

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4428435.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Hash Anti Join performance degradation

Reply via email to