Re: [PERFORM] Slow Query

2005-07-14 Thread Ragnar Hafstað
On Thu, 2005-07-14 at 10:06 +1000, Marc McIntyre wrote:

 I'm having a problem with a query that performs a sequential scan on a 
 table when it should be performing an index scan. The interesting thing 
 is, when we dumped the database on another server, it performed an index 
 scan on that server.
...
 The EXPLAIN ANALYZE from the system performing a sequential scan:
 
 QUERY PLAN
 Sort  (cost=30079.79..30079.89 rows=42 width=113) (actual 
 time=39889.989..39890.346 rows=260 loops=1)
...
 The EXPLAIN ANALYZE from the system performing an index scan:
 Sort  (cost=16873.64..16873.74 rows=40 width=113) (actual 
 time=2169.905..2169.912 rows=13 loops=1)

looks like the first query is returning 260 rows,
but the second one 13

this may not be your problem, but are you sure you are using the same
query on the same data here ?

gnari



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Planner issue

2005-03-22 Thread Ragnar Hafstað
On Tue, 2005-03-22 at 14:36 -0500, Alex Turner wrote:

 It will use an index for 220-300, but not for 200-300.
 ...
  Seq Scan on propmain  (cost=0.00..15517.56 rows=6842 width=4) (actual
 time=0.039..239.760 rows=6847 loops=1)
 ...
  Index Scan using propmain_listprice_i on propmain 
 (cost=0.00..22395.95 rows=6842 width=4) (actual time=0.084..25.751
 rows=6847 loops=1)

the row estimates are accurate, so it is not a question of statistics
anymore.

first make sure effective_cache_size is correctly set, and then 
if that is not enough, you might try to lower random_page_cost a bit
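
for example (values here are placeholders only; both settings can be
tried per session before touching postgresql.conf, and the column name
is guessed from the index name):

  SET effective_cache_size = 50000;  -- in 8KB pages
  SET random_page_cost = 2.5;        -- default is 4
  EXPLAIN ANALYZE
    SELECT * FROM propmain WHERE listprice BETWEEN 200000 AND 300000;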


gnari



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] How to read query plan

2005-03-13 Thread Ragnar Hafstað
On Sun, 2005-03-13 at 16:32 +0100, Miroslav Šulc wrote:
 Hi all,
 
 I am new to PostgreSQL and query optimizations. We have recently moved 
 our project from MySQL to PostgreSQL and we are having a performance 
 problem with one of our most often used queries. On MySQL the speed was 
 sufficient but PostgreSQL chooses a time-expensive query plan. I would 
 like to optimize it somehow but the query plan from EXPLAIN ANALYZE is 
 a little bit cryptic to me.
 

[snip output of EXPLAIN ANALYZE]

for those of us who have not yet reached the level where one can
infer it from the query plan, how about showing us the actual
query too ?

but as an example of what to look for, consider the first few lines
(reformatted): 

 Merge Right Join  (cost=9868.84..9997.74 rows=6364 width=815) 
   (actual time=9982.022..10801.216 rows=6364 loops=1)
   Merge Cond: (outer.idpk = inner.cadastralunitidfk)
   ->  Index Scan using cadastralunits_pkey on cadastralunits  
   (cost=0.00..314.72 rows=13027 width=31)
   (actual time=0.457..0.552 rows=63 loops=1)
   ->  Sort  (cost=9868.84..9884.75 rows=6364 width=788)
 (actual time=9981.405..10013.708 rows=6364 loops=1)

notice that the index scan is expected to return 13027 rows, but
actually returns 63. this might influence the choice of plan.
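
if the statistics on cadastralunits are merely stale, something as
simple as

  ANALYZE cadastralunits;

might bring the estimate closer to reality, but that is only a guess
without seeing the query.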

gnari




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Help with tuning this query

2005-03-02 Thread Ragnar Hafstað
On Wed, 2005-03-02 at 01:51 -0500, Ken Egervari wrote:
  
 select s.*
 from shipment s
 inner join carrier_code cc on s.carrier_code_id = cc.id
 inner join carrier c on cc.carrier_id = c.id
 inner join carrier_to_person ctp on ctp.carrier_id = c.id
 inner join person p on p.id = ctp.person_id
 inner join shipment_status cs on s.current_status_id = cs.id
 inner join release_code rc on cs.release_code_id = rc.id
 left join shipment_status ss on ss.shipment_id = s.id
 where
 p.id = :personId and
 s.is_purged = false and
 rc.number = '9' and
 cs is not null and
 cs.date >= current_date - 31
 order by cs.date desc
 ... 
 shipment contains 40,000 rows
 shipment_status contains 80,000 rows

I may be missing something, but it looks like the second join
on shipment_status (the left join) is not adding anything to your
results, except more work. ss is not used for output, nor in the where
clause, so what is its purpose ?

if cs.date has an upper limit, it might be helpful to change the
condition to a BETWEEN
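
something like (assuming cs.date can never be in the future):

  and cs.date BETWEEN current_date - 31 AND current_date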

in any case, i would think you might need an index on
  shipment(carrier_code_id)
  shipment(current_status_id)
  shipment_status(id)
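
for example (index names are arbitrary, and skip any that already
exist, e.g. if shipment_status.id is the primary key):

  CREATE INDEX shipment_carrier_code_i   ON shipment (carrier_code_id);
  CREATE INDEX shipment_current_status_i ON shipment (current_status_id);
  CREATE INDEX shipment_status_id_i      ON shipment_status (id);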

gnari




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Help with tuning this query

2005-03-02 Thread Ragnar Hafstað
On Wed, 2005-03-02 at 13:28 -0500, Ken Egervari wrote:
  select s.*
  from shipment s
  inner join carrier_code cc on s.carrier_code_id = cc.id
  inner join carrier c on cc.carrier_id = c.id
  inner join carrier_to_person ctp on ctp.carrier_id = c.id
  inner join person p on p.id = ctp.person_id
  inner join shipment_status cs on s.current_status_id = cs.id
  inner join release_code rc on cs.release_code_id = rc.id
  left join shipment_status ss on ss.shipment_id = s.id
  where
  p.id = :personId and
  s.is_purged = false and
  rc.number = '9' and
  cs is not null and
  cs.date >= current_date - 31
  order by cs.date desc
 
  I may be missing something, but it looks like the second join
  on shipment_status (the left join) is not adding anything to your
  results, except more work. ss is not used for output, nor in the where
  clause, so what is its purpose ?
 ...  The second 
 left join is for eager loading so that I don't have to run a separate query 
 to fetch the children for each shipment.  This really does improve 
 performance because otherwise you'll have to make N+1 queries to the 
 database, and that's just too much overhead.

are you saying that you are actually doing a
  select s.*,ss.* ...
?

  if cs.date has an upper limit, it might be helpful to change the
  condition to a BETWEEN
 
 Well, I could create an upper limit. It would be the current date.  Would 
 adding in this redundant condition improve performance?

it might help the planner better estimate the number of cs rows
affected. whether this improves performance depends on whether
the best plans are sensitive to this.

an EXPLAIN ANALYSE might reduce the guessing.

gnari



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] PgPool changes WAS: PostgreSQL clustering VS MySQL

2005-01-24 Thread Ragnar Hafstað
On Mon, 2005-01-24 at 09:52 -0800, Josh Berkus wrote:
 [about keeping connections open in web context]
 Ah, clarity problem here. I'm talking about connection pooling tools from 
 the client (webserver) side, such as Apache::DBI, PHP's pg_pconnect, 
 Jakarta's connection pools, etc.   Not pooling on the database server side, 
 which is what pgPool provides.

note that these sometimes do not provide connection pooling as such,
just persistent connections (Apache::DBI)

 Most of these tools allocate a database connection to an HTTP/middleware 
 client, and only release it after a specific period of inactivity. This 
 means that you *could* count on web-user==connection for purposes of 
 switching back and forth to the master -- as long as the connection-recycling 
 timeout were set higher than the pgPool switch-off period.

no. you can only count on web-server-process==connection, but not
web-user==connection, unless you can guarantee that the same user
client always connects to the same web-server process.

am i missing something ?

gnari



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] index scan of whole table, can't see why

2005-01-20 Thread Ragnar Hafstað
On Wed, 2005-01-19 at 21:00 -0800, [EMAIL PROTECTED] wrote:
 Let's see if I have been paying enough attention to the SQL gurus. 
 The planner is making a different estimate of how many deprecated <> '' versus 
 how many broken <> ''. 
 I would try SET STATISTICS to a larger number on the ports table, and 
 re-analyze.

that should not help, as the estimate is accurate, according to the
explain analyze.

gnari



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] index scan of whole table, can't see why

2005-01-20 Thread Ragnar Hafstað
On Wed, 2005-01-19 at 20:37 -0500, Dan Langille wrote:
 Hi folks,
 
 Running on 7.4.2, recently vacuum analysed the three tables in 
 question.
 
 The query plan in question changes dramatically when a WHERE clause 
 changes from ports.broken to ports.deprecated.  I don't see why.  
 Well, I do see why: a sequential scan of 130,000 rows.  The query 
 goes from 13ms to 1100ms because of this.  The full plans are at 
 http://rafb.net/paste/results/v8ccvQ54.html
 
 I have tried some tuning by:
 
   set effective_cache_size to 4000, was 1000
   set random_page_cost to 1, was 4
 
 The resulting plan changes, but no speed improvement, are at 
 http://rafb.net/paste/results/rV8khJ18.html
 

this just confirms that an indexscan is not always better than a
tablescan. by setting random_page_cost to 1, you are deceiving the
planner into thinking that the indexscan is almost as effective
as a tablescan.

 Any suggestions please?  

did you try to increase sort_mem ?
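
for example, per session (the unit is KB in 7.4, and the value is only
an illustration):

  SET sort_mem = 16384;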

gnari



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-01-20 Thread Ragnar Hafstað
On Thu, 2005-01-20 at 19:12 +0000, Ragnar Hafstað wrote:
 On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote:
 
  The best way to do paging is not to use offset or cursors but to use an
  index. This only works if you can enumerate all the sort orders the
  application might be using and can have an index on each of them.
  
  To do this the query would look something like:
  
  SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50
  
  Then you take note of the last value used on a given page and if the user
  selects next you pass that as the starting point for the next page.
 
 this will only work unchanged if the index is unique. imagine, for
 example, if you have more than 50 rows with the same value of col.
 
 one way to fix this is to use ORDER BY col,oid

and a slightly more complex WHERE clause as well, of course
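
an untested sketch (the first two ?s take the col value of the last
row of the previous page, the third one its oid):

  SELECT * FROM tab
  WHERE col > ? OR (col = ? AND oid > ?)
  ORDER BY col, oid
  LIMIT 50;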

gnari



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Postgres Optimizer is not smart enough?

2005-01-12 Thread Ragnar Hafstað
On Thu, 2005-01-13 at 12:14 +1300, Mark Kirkwood wrote:

[snip some explains]

 
 I have random_page_cost = 0.8 in my postgresql.conf. Setting it back to
 the default (4) results in a plan using test_id1.

it is not rational to have random_page_cost < 1.

if you see improvement with such a setting, it is as likely that
something else is wrong, such as statistics targets needing to be higher,
or a much too low effective_cache_size setting.
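
both are cheap to test; the table, column and values below are
placeholders only:

  ALTER TABLE test ALTER COLUMN id SET STATISTICS 100;  -- default is 10
  ANALYZE test;
  SET effective_cache_size = 100000;                    -- in 8KB pages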

gnari



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Howto Increased performace ?

2004-12-27 Thread Ragnar Hafstað
On Mon, 2004-12-27 at 22:31 +0700, Amrit Angsusingh wrote:
  [ [EMAIL PROTECTED] ]
 
  These are some settings that I am planning to start with for a 4GB RAM
  dual opteron system with a maximum of 100 connections:
 
 
  shared_buffers 8192 (=67MB RAM)
  sort_mem 4096 (=400MB RAM for 100 connections)
  effective_cache_size 380000 (@8KB = 3.04GB RAM)
  vacuum_mem 32768 KB
  wal_buffers 64
  checkpoint_segments 8
 
  In theory, effective cache size is the amount of memory left over for the
  OS to cache the filesystem after running all programs and having 100 users
  connected, plus a little slack.

 I reduced the connections to 160 and configured as below; there is some
 improvement in speed.
 shared_buffers = 27853 [Should I reduce it to nearly what you use, and
 what will happen?]

at some point, more shared buffers will do less good than leaving the
memory to the OS to use as disk buffers. you might want to experiment
a bit with different values to find what suits your real-life conditions

 sort_mem = 8192
 vacuum_mem = 16384
 effective_cache_size = 81920 [Should I increase it to more than 200000 ?]
as Iain wrote, this value is an indication of how much memory will be
available to the OS for disk cache.
when all other settings have been made, try to see how much memory your
OS has left under normal conditions, and adjust your setting
accordingly, if it differs significantly.
I have seen cases where an incorrect value (too low) influenced the
planner to use sequential scans instead of better indexscans,
presumably because of a lower ratio of estimated cache hits for the
indexscan.

 Thanks for any comment again.
 
 NB. There was a huge disaster in my country, the tsunamis, and all the
 people over the country, including me, fell into deep sorrow.

my condolences.

 Amrit Angsusingh
 Thailand

gnari



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Howto Increased performace ?

2004-12-24 Thread Ragnar Hafstað
On Tue, 2004-12-21 at 16:31 +0700, Amrit Angsusingh wrote:
 I used postgresql 7.3.2-1 with RH 9 on a machine with 2 Xeon 3.0 GHz CPUs
 and 4 GB of RAM. For the 1 1/2 years since I started to use the database
 server, after optimizing the postgresql.conf, everything went fine, until
 a couple of weeks ago, when my database grew to 3.5 GB and there were more
 than 140 concurrent connections.
...
 shared_buffers = 250000
this is much higher than usually advised on this list.
try to reduce this to 25000
 
 effective_cache_size = 5000
and increase this instead, to, say, 50000
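
i.e. in postgresql.conf, roughly (both values are in 8KB pages in 7.3):

  shared_buffers = 25000          # ~200MB
  effective_cache_size = 50000    # ~400MB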


gnari



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Speed in V8.0

2004-12-24 Thread Ragnar Hafstað
On Wed, 2004-12-22 at 00:03 +0100, Thomas Wegner wrote:
 Hello, I have a problem between V7.4.3 Cygwin and
 V8.0RC2 W2K. I have 2 systems:
 
 1. Production Machine
 - Dual P4 3000MHz
 - 2 GB RAM
 - W2K
 - PostgreSQL 7.4.3 under Cygwin
 - i connect to it over a DSL Line
 2. Develop Machine
 - P4 1800MHz
 - 760 MB RAM
 - PostgreSQL Native Windows
 - local connection 100MB/FD
 
 Both systems use the default postgresql.conf. Now the problem.
 I have an (unoptimized, dynamic) query which was executed on the
 production machine over DSL in 2 seconds and on my development
 machine, connected over the local LAN, in 119 seconds!

has the development database been ANALYZED ?
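
if not, a simple

  VACUUM ANALYZE;

in psql on the development database will refresh the planner statistics.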
 
gnari



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Tips for a system with _extremely_ slow IO?

2004-12-21 Thread Ragnar Hafstað
On Fri, 2004-12-17 at 23:51 -0800, Ron Mayer wrote:
 Any advice for settings for extremely IO constrained systems?
 
 A demo I've set up for sales seems to be spending much of its time in 
 disk wait states.
 
 
 The particular system I'm working with is:
 Ext3 on Debian inside Microsoft VirtualPC on NTFS
 on WindowsXP on laptops of our sales team.

As this is only for demo purposes, you might consider turning fsync off,
although I have no idea if it would have any effect on your setup.
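
i.e. in postgresql.conf (for throwaway demo data only; never do this
where the data must survive a crash):

  fsync = false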

gnari



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] [NOVICE] \d output to a file

2004-12-15 Thread Ragnar Hafstað
On Wed, 2004-12-15 at 11:50 -0500, Tom Lane wrote:
 Geoffrey <[EMAIL PROTECTED]> writes:
  sarlav kumar wrote:
  I would like to write the output of the \d command on all tables in a
  database to an output file.
 
  What is the OS?  On any UNIX variant you can do:
   echo '\d' | psql > outputfile
 
 Or use \o:
 
 regression=# \o zzz1
 regression=# \d
or:
=# \d *
to get all tables as the OP wanted

 regression=# \o
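
or from the shell in one go (assuming a database named mydb):

  psql -c '\d *' mydb > outputfile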

gnari



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]