Re: [PERFORM] Query optimization help

2011-08-30 Thread Ondrej Ivanič
Hi,

On 30 August 2011 15:36, Szymon Kosok szy...@mwg.pl wrote:
 Hello,

 I asked that question on StackOverflow, but didn't get any valuable
 response, so I'll ask it here. :)

 I have such query:

Could you please re-post your explain using this web site:
http://explain.depesz.com/ and post links to Stackoverflow question?
What is your Postgres version? Database settings?
I see huge discrepancy between predicted and actual row numbers (like
1264420 vs 485). I would try the following:

- check column statistics (pg_stasts) and focus on the following
columns: n_distinct, null_frac, most_common_vals. If they are way-off
from the actual values then you should tweak (auto)analyze process:
run manual/auto analyse more often (check pg_stat_user_tables),
increase default_statistics_target (per column or global)

- try to disable nested loop join (set enable_nestloop=off)

-- 
Ondrej Ivanic
(ondrej.iva...@gmail.com)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Re: How to track number of connections and hosts to Postgres cluster

2011-08-30 Thread Scott Marlowe
On Mon, Aug 29, 2011 at 11:55 PM, Venkat Balaji venkat.bal...@verse.in wrote:
 If i notice high IO's and huge log generation, then i think Greg Spileburg
 has suggested a good idea of using tcpdump on a different server. I would
 use this utility and see how it works (never used it before). Greg
 Spileburg, please  help me with any sources of documents you have to use
 tcpdump.

There's also a lot to be said for dumping to a dedicated local drive
with fsync turned off.  They're logs so you can chance losing them by
putting them on a cheap fast 7200 rpm SATA drive.  If your logs take
up more than a few megs a second then they are coming out really fast.
 Do you know what your log generation rate in bytes/second is?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query optimization help

2011-08-30 Thread Szymon Kosok
2011/8/30 Ondrej Ivanič ondrej.iva...@gmail.com:
 Could you please re-post your explain using this web site:
 http://explain.depesz.com/ and post links to Stackoverflow question?

Here it is: http://explain.depesz.com/s/Iaa

 - try to disable nested loop join (set enable_nestloop=off)

Even worse performance (http://explain.depesz.com/s/mMi).

My configuration:http://pastie.org/2453148 (copied and pasted only
uncommented important variables). It's decent hardware. i7, 16 GB of
RAM, 3x2 RAID 10 (7200rpm) for OS + data, RAID 1 (2 disks, 7200rpm)
for WAL, RAID controller with BBU and 512 MB memory cache (cache is
set to write only).

PS. Sorry Ondrej, accidentally I've sent reply to you, not to list.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] 8.4 optimization regression?

2011-08-30 Thread Grzegorz Jaśkiewicz
2011/8/29 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 I note from the commit message that the fix test case was from Grzegorz
 Jaskiewicz (antijoin against a small subset of a relation).  I was not able
 to find this in the archives - Grzegorz do you recall the actual test case?
 I thought it might be useful for me to spend some time studying both cases
 and seeing if I can come up with any tweaks that would let both your and my
 queries work well!

Sorry, I don't remember that particular example. If I complained about
it, it would  have been on this list or the general list.
I'll have a look by date.

-- 
GJ

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query optimization help

2011-08-30 Thread Ondrej Ivanič
Hi,

2011/8/30 Szymon Kosok szy...@mwg.pl:
 2011/8/30 Ondrej Ivanič ondrej.iva...@gmail.com:
 Could you please re-post your explain using this web site:
 http://explain.depesz.com/ and post links to Stackoverflow question?

 Here it is: http://explain.depesz.com/s/Iaa

 - try to disable nested loop join (set enable_nestloop=off)

Thanks, I would try to materialise spoleczniak_tablica table. Your
query looks like this:
select ...
from spoleczniak_tablica
inner join ...
where ...
order by spoleczniak_tablica.id desc
limit 21

So I would rewrite your query like this:
select ...
from (
 select ...
 from spoleczniak_tablica
 where 
 order by spoleczniak_tablica.id desc
 limit 21
) as x
inner join ...


-- 
Ondrej Ivanic
(ondrej.iva...@gmail.com)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Summaries on SSD usage?

2011-08-30 Thread Stefan Keller
Hi,

I'm looking for summaries (or best practices) on SSD usage with PostgreSQL.
My use case is mainly a read-only database.
Are there any around?

Yours, Stefan

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] IN or EXISTS

2011-08-30 Thread Andy Colson

Hi all,

I have read things someplace saying not exists was better than not in... 
or something like that.  Not sure if that was for in/exists and not 
in/not exists, and for a lot of records or not.


Here is my setup:

My website has a general table, let say 60k rows.  Its mostly read-only. 
 Every once and a while we get updated data, so I:

create schema upd;
create table upd.general(like public.general);

Then I dump the new data into upd.general.  (This has many table's and 
steps, I'm simplifying it here).


For the last step, I want to:

begin;
delete from public.general where gid in (select gid from upd.general);
insert into public.general select * from upd.general;
... 7 other tables same way ...
commit;


Most of the time upd.general will be  500 rows.  Every once and a while 
things get messed up and we just update the entire database, so count(*) 
upd.general == count(*) public.general.


My question is:
fast is nice, but safe and less resource intensive is better, so which 
would I probably like better:


delete from public.general where gid in (select gid from upd.general);

or

-- currently dont have and index, so
create index general_pk on upd.general(gid);
delete from public.general a where exists(select 1 from upd.general b 
where a.gid=b.gid);



Thanks for any suggestions,

-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] 8.4 optimization regression?

2011-08-30 Thread Mark Kirkwood

On 30/08/11 21:43, Grzegorz Jaśkiewicz wrote:

2011/8/29 Mark Kirkwoodmark.kirkw...@catalyst.net.nz:


I note from the commit message that the fix test case was from Grzegorz
Jaskiewicz (antijoin against a small subset of a relation).  I was not able
to find this in the archives - Grzegorz do you recall the actual test case?
I thought it might be useful for me to spend some time studying both cases
and seeing if I can come up with any tweaks that would let both your and my
queries work well!

Sorry, I don't remember that particular example. If I complained about
it, it would  have been on this list or the general list.
I'll have a look by date.



Thanks - however I think I have managed to make up a good test case that 
shows the particular commit working. More on that to come!


Cheers

Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance