Re: [PERFORM] bad performance on Solaris 10

2006-04-13 Thread Bruce Momjian
Jignesh K. Shah wrote:
> 
> Bruce,
> 
> Hard to answer that... People like me who know and love PostgreSQL and
> Solaris see this as an opportunity to make their favorite database
> work best on their favorite operating system.
> 
> PostgreSQL often bases its behavior on the assumption that it will run
> on Linux, and it is left to Solaris to emulate that behavior. That
> said, there are ways to improve performance even on UFS on Solaris; it
> just requires more tweaks.
> 
> Hopefully this will lead to a few Solaris-friendly default values, like
> fsync/open_datasync :-)

Yes, if someone wants to give us a clear answer on which wal_sync method
is best on all versions of Solaris, we can easily make that change.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] Better index strategy for many fields with few values

2006-04-13 Thread Markus Schaber
Hi, Jim,

Jim C. Nasby wrote:

>>>I was also thinking about using a functional index.
>>If there's a logical relation between those values so that they can
>>easily be combined, that may be a good alternative.
> How would that be any better than just doing a multi-column index?

10 different values per column across 20 columns gives 10^20 value
combinations.

Partitioning on the first column leaves 10^19 combinations, which is
smaller than 2^64 and thus fits into a long (bigint) value.

And I just guess that a 10-partition functional index on a long value
could perform better than a multi-column index on 20 columns of
character(10), if only because it is approx. 1/25th in size.
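
As a very rough sketch of the idea (table, column names, and the value
mapping here are hypothetical; the real mapping of each column's ~10
values to a digit is application-specific):

-- Map one low-cardinality char(10) column to a digit 0..9.
CREATE FUNCTION param_digit(character(10)) RETURNS integer AS $$
    SELECT CASE $1 WHEN 'red' THEN 0 WHEN 'green' THEN 1 ELSE 9 END;
$$ LANGUAGE sql IMMUTABLE;

-- Pack several such columns into one bigint key (base 10).
CREATE FUNCTION packed_key(character(10), character(10), character(10))
RETURNS bigint AS $$
    SELECT (param_digit($1) * 100 + param_digit($2) * 10 + param_digit($3))::bigint;
$$ LANGUAGE sql IMMUTABLE;

-- One partial ("partitioned") functional index per value of the first column.
CREATE INDEX idx_params_p0 ON params (packed_key(param2, param3, param4))
    WHERE param1 = 'value0';
CREATE INDEX idx_params_p1 ON params (packed_key(param2, param3, param4))
    WHERE param1 = 'value1';
-- ...one index per distinct value of param1.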

HTH,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] Better index strategy for many fields with few values

2006-04-13 Thread Markus Schaber
Hi, Jim,

Jim Nasby wrote:
> Adding -performance back in
> I would like to try it.
> 
> However, in another post I added that, contrary to what I stated
> initially, not all the paramXX columns are mandatory in the query. So
> it seems that requirement makes the problem more complex.

Okay, this rules out my functional index over 19 columns.

> Doesn't this new requirement rule out this solution? 
> 
> No, just group the columns logically.

Yes, that's the solution.

If you have common groups of columns that appear and disappear
synchronously, pack those together in a (possibly partitioned and/or
functional) index.

Then rely on the query planner to combine the appropriate indices
via an index bitmap scan, as sketched below.
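
As a rough illustration (table and column names are hypothetical, and
whether the planner actually chooses a BitmapAnd depends on the
statistics and selectivity of the real data):

-- param1..param3 tend to be queried together, as do param4..param6,
-- so each group gets its own multi-column index.
CREATE INDEX idx_params_grp1 ON params (param1, param2, param3);
CREATE INDEX idx_params_grp2 ON params (param4, param5, param6);

-- A query constraining both groups can be answered by scanning both
-- indexes and combining them with a BitmapAnd; check with EXPLAIN.
EXPLAIN
SELECT * FROM params
 WHERE param1 = 'a' AND param2 = 'b' AND param3 = 'c'
   AND param4 = 'd' AND param5 = 'e';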

>  By the way, I have tested indexing each column individually and checked
> what happens in relation to bitmap scans. My test table is 1
> million rows. The EXPLAIN ANALYZE command shows that a bitmap scan is
> sometimes used, but I still end up with queries that can take up to
> 10s, which is way too much.

Is it on the first query, or on repeated queries?

It might be that you're I/O bound, and the backend has to fetch indices
and rows from disk into RAM.

I currently don't know whether the order of indices in a multi-index
bitmap scan is relevant, but I could imagine that it may be useful to
have the most selective index scanned first.

And keep in mind that, assuming an equal distribution of your
parameters, every index bitmap hits 1/10th of the whole table on
average, so the selectivity generally is low.

The selectivity of a partitioned 3-column index will be much better
(about 1/1000th of the whole table), and fewer index scans and bitmaps
have to be generated.

A functional index may also make sense as a target for CLUSTERing the
table, to optimize the locality of search results (and thus reduce disk
I/O). If your table has low write activity but high read-only activity,
the overhead that comes with the additional index is negligible compared
to the performance improvement proper CLUSTERing can generate.
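
For example (names hypothetical; note that CLUSTER takes an exclusive
lock and is a one-time operation that has to be repeated after
substantial updates):

-- Physically reorder the table along the chosen index, then refresh stats.
CLUSTER idx_params_grp1 ON params;
ANALYZE params;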

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



[PERFORM] Slow query - possible bug?

2006-04-13 Thread Gavin Hamill
laterooms=# explain analyze select allocation0_."ID" as y1_, 
allocation0_."RoomID" as y2_, allocation0_."StatusID" as y4_, 
allocation0_."Price" as y3_, allocation0_."Number" as y5_, 
allocation0_."Date" as y6_ from "Allocation" allocation0_ where 
(allocation0_."Date" between '2006-06-09 00:00:00.00' and 
'2006-06-09 00:00:00.00')and(allocation0_."RoomID" in(4300591));

                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using ix_date on "Allocation" allocation0_  (cost=0.00..4.77 rows=1 width=34) (actual time=1411.325..1689.860 rows=1 loops=1)
   Index Cond: (("Date" >= '2006-06-09'::date) AND ("Date" <= '2006-06-09'::date))
   Filter: ("RoomID" = 4300591)
 Total runtime: 1689.917 ms
(4 rows)

Yep, the two dates are identical - yep I would change the client 
software to do where "Date" = '2006-06-09 00:00:00.00' if I could...


However, it's clear to see why this simple query is taking so long - the 
plan is selecting /all/ dates after 2006-06-09 and /all/ dates before 
then, and only returning the union of the two - a large waste of effort, 
surely?


VACUUM ANALYZE hasn't improved matters... the schema for the table is

"ID" int8 NOT NULL DEFAULT 
nextval(('public."allocation_id_seq"'::text)::regclass),

 "RoomID" int4,
 "Price" numeric(10,2),
 "StatusID" int4,
 "Number" int4,
 "Date" date,

and there are indexes kept for 'RoomID' and 'Date' in this 4.3-million 
row table.


Is this a bug or a hidden feature in pg 8.1.3 ? :)

Cheers,
Gavin.




Re: [PERFORM] Slow query - possible bug?

2006-04-13 Thread chris smith
On 4/13/06, Gavin Hamill <[EMAIL PROTECTED]> wrote:
> laterooms=# explain analyze select allocation0_."ID" as y1_,
> allocation0_."RoomID" as y2_, allocation0_."StatusID" as y4_,
> allocation0_."Price" as y3_, allocation0_."Number" as y5_,
> allocation0_."Date" as y6_ from "Allocation" allocation0_ where
> (allocation0_."Date" between '2006-06-09 00:00:00.00' and
> '2006-06-09 00:00:00.00')and(allocation0_."RoomID" in(4300591));
> QUERY PLAN
> --
>  Index Scan using ix_date on "Allocation" allocation0_  (cost=0.00..4.77
> rows=1 width=34) (actual time=1411.325..1689.860 rows=1 loops=1)
>Index Cond: (("Date" >= '2006-06-09'::date) AND ("Date" <=
> '2006-06-09'::date))
>Filter: ("RoomID" = 4300591)
>  Total runtime: 1689.917 ms
> (4 rows)

1.6secs isn't too bad on 4.3mill rows...

How many entries are there for that date range?

--
Postgresql & php tutorials
http://www.designmagick.com/



Re: [PERFORM] Slow query - possible bug?

2006-04-13 Thread Gavin Hamill

chris smith wrote:

> 1.6secs isn't too bad on 4.3mill rows...
>
> How many entries are there for that date range?

1.7 secs /is/ good - it typically takes 5 or 6 seconds, which isn't so
good. My question is 'why does the planner choose such a bizarre range
request when both elements of the 'between' are identical? :)'


If I replace the
(allocation0_."Date" between '2006-06-09 00:00:00.00' and 
'2006-06-09 00:00:00.00')


with

allocation0_."Date" ='2006-04-09 00:00:00.00'

then the query comes back in a few milliseconds (as I'd expect :) - and 
yup I've been using different dates for each test to avoid the query 
being cached.


For ref, there are typically 35000 rows per date :)

Cheers,
Gavin.




Re: [PERFORM] Slow query - possible bug?

2006-04-13 Thread Richard Huxton

Gavin Hamill wrote:
> chris smith wrote:
>> 1.6secs isn't too bad on 4.3mill rows...
>>
>> How many entries are there for that date range?
>
> 1.7 secs /is/ good - it typically takes 5 or 6 seconds, which isn't so
> good. My question is 'why does the planner choose such a bizarre range
> request when both elements of the 'between' are identical? :)'


What's bizarre about the range request, and are you sure it's evaluating
both conditions separately and taking the union? It looks to me like it's
doing a standard range search. If it were trying to fetch 4.3 million
rows via that index, I'd expect it to use a different index instead.


If you've got stats turned on, look in pg_stat_user_indexes/tables 
before and after the query to see. Here's an example of a similar query 
against one of my log tables. It's small, but the clause is the same, 
and I don't see any evidence of the whole table being selected.


lamp=> SELECT * FROM pg_stat_user_indexes WHERE relname LIKE 'act%';
  relid  | indexrelid | schemaname | relname |  indexrelname  | idx_scan | idx_tup_read | idx_tup_fetch
---------+------------+------------+---------+----------------+----------+--------------+---------------
 6124993 |    7519044 | public     | act_log | act_log_ts_idx |       23 |           18 |            18
 6124993 |    7371115 | public     | act_log | act_log_pkey   |        0 |            0 |             0
(2 rows)

lamp=> EXPLAIN ANALYSE SELECT * FROM act_log WHERE al_ts BETWEEN
       '2006-04-05 14:10:23+00'::timestamptz AND '2006-04-05 14:10:23+00'::timestamptz;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using act_log_ts_idx on act_log  (cost=0.00..3.02 rows=1 width=102) (actual time=0.116..0.131 rows=1 loops=1)
   Index Cond: ((al_ts >= '2006-04-05 15:10:23+01'::timestamp with time zone) AND (al_ts <= '2006-04-05 15:10:23+01'::timestamp with time zone))
 Total runtime: 0.443 ms
(3 rows)

lamp=> SELECT * FROM pg_stat_user_indexes WHERE relname LIKE 'act%';
  relid  | indexrelid | schemaname | relname |  indexrelname  | idx_scan | idx_tup_read | idx_tup_fetch
---------+------------+------------+---------+----------------+----------+--------------+---------------
 6124993 |    7519044 | public     | act_log | act_log_ts_idx |       24 |           19 |            19
 6124993 |    7371115 | public     | act_log | act_log_pkey   |        0 |            0 |             0
(2 rows)


1. vacuum full verbose your table (and post the output please)
2. perhaps reindex?
3. Try the explain analyse again and see what happens (the commands are
   sketched below).
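
Roughly, against the table from this thread (SELECT list abbreviated to *):

VACUUM FULL VERBOSE "Allocation";
REINDEX TABLE "Allocation";
EXPLAIN ANALYZE
SELECT * FROM "Allocation" allocation0_
 WHERE (allocation0_."Date" BETWEEN '2006-06-09' AND '2006-06-09')
   AND (allocation0_."RoomID" IN (4300591));
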
--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Better index strategy for many fields with few values

2006-04-13 Thread Markus Schaber
Hi, Oscar,

Please reply to the list and not privately, so others can learn from
your replies, and possibly have better ideas than I do.

Oscar Picasso wrote:

> I cannot group the columns logically. Any column may or may not appear
> in a query.

That's suboptimal.

> Summarizing what I have learned:
> - I cannot use multicolumn indexes because I cannot group the columns
> logically.
> - I cannot use functional indexes.
> - I cannot use clustering.

You can still have a set of partitioned multi-column indices,
overlapping enough that every combination of columns is covered (or risk
a sequential sub-scan for the last two or three columns; this should not
hurt too much if the first 17 columns were selective enough).

The main problem with indices is that they also decrease write performance.

If disk cost is not a limiting factor, it makes sense to have WAL, table
and indices on different disks / RAID arrays, to parallelize writes.

Btw, I guess you have multiple, concurrent users?

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] pgmemcache

2006-04-13 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:

>>Why are AFTER COMMIT triggers impossible?
> 
> What happens if such a trigger gets an error?  You can't un-commit.

Then it must be specified that those triggers are in their own
transaction, and cannot abort the transaction.

Or use the 2-phase-commit infrastructure for them.

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] pg 8.1.3, AIX, huge box, painfully slow.

2006-04-13 Thread Gavin Hamill

Tom Lane wrote:

> Gavin Hamill <[EMAIL PROTECTED]> writes:
>> would a simple  "#define LWLOCK_PADDED_SIZE  128" be sufficient?
>
> Yeah, that's fine.


OK I tried that but noticed no real improvement... in the interim I've 
installed Debian on the pSeries (using 
http://debian.gonicus.de/debian/dists/sarge/main/disks-powerpc/current/pseries/install.txt 
) and using a simple load-test script - it picks a 'hotelsearch' select 
at random from a big file and just does a pg_query on that via PHP...


Using apachebench with 10 clients gave a loadavg of about 10 after a few 
minutes, and the logs showed typical query times of 8 seconds. Again, no 
disk activity, normal context-switching, just full-out CPU usage...


We're improving the quality and efficiency of the hotelsearch function all
the time (Simon will certainly be able to vouch for its complexity) - I'm
really uncertain what to do next though! :/


Cheers,
Gavin.




[PERFORM] index is not used if I include a function that returns current time in my query

2006-04-13 Thread Cristian Veronesi

Hello, postgresql 7.4.8 on SuSE Linux here.

I have a table called DMO with a column called ORA_RIF defined as
"timestamp without time zone" ;

I created an index on this table based on this column only.

If I run a query against a text literal the index is used:


explain select * from dmo where ora_rif>'2006-01-01';

  QUERY PLAN
-
 Index Scan using dmo_ndx02 on dmo  (cost=0.00..1183.23 rows=736 width=156)
   Index Cond: (ora_rif > '2006-01-01 00:00:00'::timestamp without time
zone)

If I try to use a function that returns the current time instead, a
sequential scan is always performed:


explain select * from dmo where ora_rif>localtimestamp;

  QUERY PLAN
--
 Seq Scan on dmo  (cost=0.00..1008253.22 rows=2703928 width=156)
   Filter: (ora_rif > ('now'::text)::timestamp(6) without time zone)

explain select * from dmo where ora_rif>localtimestamp::timestamp without time zone;
  QUERY PLAN
--
 Seq Scan on dmo  (cost=0.00..1008253.22 rows=2703928 width=156)
   Filter: (ora_rif > ('now'::text)::timestamp(6) without time zone)

... etc. ...

(tried with all datetime functions with and without cast)

I even tried to write a function that explicitly returns a "timestamp
without time zone" value:

create or replace function f () returns timestamp without time zone
as '
declare
  x timestamp without time zone ;
begin
  x := ''2006-01-01 00:00:00'';
  return x ;
end ;
' language plpgsql ;

But the result is the same:


explain select * from dmo where ora_rif>f();

 QUERY PLAN
-
 Seq Scan on dmo  (cost=0.00..987973.76 rows=2703928 width=156)
   Filter: (ora_rif > f())

Any suggestion?

Kind regards,

--
Cristian Veronesi - C.R.P.A. S.p.A. - Reggio Emilia, Italy

The first thing you need to learn about databases is that
they are not just a fancy file system for storing data.






Re: [PERFORM] Slow query - possible bug?

2006-04-13 Thread Tom Lane
Gavin Hamill <[EMAIL PROTECTED]> writes:
> If I replace the
> (allocation0_."Date" between '2006-06-09 00:00:00.00' and 
> '2006-06-09 00:00:00.00')
> with
> allocation0_."Date" ='2006-04-09 00:00:00.00'
> then the query comes back in a few milliseconds (as I'd expect :)

Could we see EXPLAIN ANALYZE for
* both forms of the date condition, with the roomid condition;
* both forms of the date condition, WITHOUT the roomid condition;
* just the roomid condition

I'm thinking the planner is misestimating something, but it's hard
to tell what without breaking it down.
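
Spelled out as a sketch against the table from this thread (column list
abbreviated to *):

EXPLAIN ANALYZE SELECT * FROM "Allocation"
 WHERE "Date" BETWEEN '2006-06-09' AND '2006-06-09' AND "RoomID" IN (4300591);
EXPLAIN ANALYZE SELECT * FROM "Allocation"
 WHERE "Date" = '2006-06-09' AND "RoomID" IN (4300591);
EXPLAIN ANALYZE SELECT * FROM "Allocation"
 WHERE "Date" BETWEEN '2006-06-09' AND '2006-06-09';
EXPLAIN ANALYZE SELECT * FROM "Allocation"
 WHERE "Date" = '2006-06-09';
EXPLAIN ANALYZE SELECT * FROM "Allocation"
 WHERE "RoomID" IN (4300591);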

regards, tom lane



Re: [PERFORM] Better index stategy for many fields with few values

2006-04-13 Thread Oscar Picasso
Hi Markus,

Markus Schaber <[EMAIL PROTECTED]> wrote:
> Hi, Oscar,
>
> Please reply to the list and not privately, so others can learn from
> your replies, and possibly have better ideas than me.

That was my intention. I made a mistake.

> Oscar Picasso wrote:
>
>> I cannot group the columns logically. Any column may or may not appear
>> in a query.
>
> That's suboptimal.
>
>> Summarizing what I have learned:
>> - I cannot use multicolumn indexes because I cannot group the columns
>> logically.
>> - I cannot use functional indexes.
>> - I cannot use clustering.
>
> You can still have a set of partitioned multi-column indices,
> overlapping enough that every combination of columns is covered (or risk
> a sequential sub-scan for the last two or three columns; this should not
> hurt too much if the first 17 columns were selective enough).
>
> The main problem with indices is that they also decrease write performance.
>
> If disk cost is not a limiting factor, it makes sense to have WAL, table
> and indices on different disks / RAID arrays, to parallelize writes.
>
> Btw, I guess you have multiple, concurrent users?

Yes I do.

I have just made other tests with only the individual indexes, and
performance is much better than previously. Obviously there was an I/O
problem during my initial test.

Something interesting though. If I use few columns in the query, the
results come very quickly and pg does a sequential scan. When it reaches
some threshold (4 or 5 columns), pg switches to bitmap scans. It then
takes an almost constant time (~ 2500 ms) no matter how many more columns
I add to the where clause.

Interestingly enough, queries with many columns are less common. They
also return fewer results, and often no result at all. From the user
point of view it would be nice to have a waiting time lower than 2500 ms
for these queries. Maybe I could achieve that goal simply by tuning
PostgreSQL. In such a case, where should I look first in order to speed
up bitmap scanning? Maybe I could, that way, avoid the use of partitioned
multi-column indexes.

Oscar

Re: [PERFORM] index is not used if I include a function that returns current time in my query

2006-04-13 Thread Tom Lane
Cristian Veronesi <[EMAIL PROTECTED]> writes:
> If I try to use a function that returns the current time instead, a
> sequential scan is always performed:
> ...
> Any suggestion?

1. Use something newer than 7.4 ;-)

2. Set up a dummy range constraint, ie

select ... where ora_rif > localtimestamp and ora_rif < 'infinity';

The problem you have is that the planner doesn't know the value of the
function and falls back to a default assumption about the selectivity of
the '>' condition --- and that default discourages indexscans.  (Note
the very large estimate of number of rows returned.)  In the
range-constraint situation, the planner still doesn't know the value of
the function, but its default assumption for a range constraint is
tighter and it (probably) will choose an indexscan.

Since PG 8.0, the planner understands that it's reasonable to
pre-evaluate certain functions like localtimestamp to obtain
better-than-guess values about selectivity, so updating would
be a better fix.

regards, tom lane



Re: [PERFORM] multi column query

2006-04-13 Thread Jim Nasby
You need to run EXPLAIN ANALYZE. Also, what's random_page_cost set to? And the 
output of \d chkpfw_tr_dy_dimension. The cost for that index scan looks way too 
high.
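
Roughly, in psql (the BEGIN/ROLLBACK wrapper keeps the test UPDATE from
actually being applied, since EXPLAIN ANALYZE executes the statement it
plans):

BEGIN;
EXPLAIN ANALYZE
UPDATE chkpfw_tr_dy_dimension SET ...;   -- the full UPDATE quoted below
ROLLBACK;

SHOW random_page_cost;
\d chkpfw_tr_dy_dimension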

And please reply-all so that the list is included.

> -Original Message-
> From: Sriram Dandapani [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, April 12, 2006 7:48 PM
> To: Jim Nasby
> Subject: RE: [PERFORM] multi column query
> 
> 
> I executed enable_seqscan=off and then ran an explain plan on 
> the query
> 
> UPDATE chkpfw_tr_dy_dimension
>SET summcount = a.summcount + b.summcount,
>bytes = a.bytes + b.bytes,
>duration = a.duration + b.duration
>from chkpfw_tr_dy_dimension a,
> c_chkpfw_dy_tr_updates b
>WHERE a.firstoccurrence = b.firstoccurrence
>   AND a.customerid_id = b.customerid_id
>AND a.sentryid_id = b.sentryid_id
>   AND a.node_id = b.node_id
>AND a.interface_id = b.interface_id
>AND a.source_id = b.source_id
>AND a.destination_id = b.destination_id
>AND a.sourceport_id = b.sourceport_id
>AND a.destinationport_id = b.destinationport_id
>AND a.inoutbound_id = b.inoutbound_id
>AND a.action_id = b.action_id
>AND a.protocol_id = b.protocol_id
>AND a.service_id = b.service_id
>AND a.sourcezone_id = b.sourcezone_id
>AND a.destinationzone_id =
> b.destinationzone_id;
> 
> 
> 
> Here is the query plan
> 
> 
> "Nested Loop  (cost=20036.18..221851442.39 rows=1 width=166)"
> "  ->  Merge Join  (cost=10036.18..121620543.75 rows=1 width=96)"
> "Merge Cond: (("outer".firstoccurrence =
> "inner".firstoccurrence) AND ("outer".sentryid_id = 
> "inner".sentryid_id)
> AND ("outer".node_id = "inner".node_id))"
> "Join Filter: (("outer".customerid_id = "inner".customerid_id)
> AND ("outer".interface_id = "inner".interface_id) AND 
> ("outer".source_id
> = "inner".source_id) AND ("outer".destination_id =
> "inner".destination_id) AND ("outer".sourceport_id = "inner".s (..)"
> "->  Index Scan using chkpfw_tr_dy_idx1 on
> chkpfw_tr_dy_dimension a  (cost=0.00..21573372.84 rows=6281981
> width=88)"
> "->  Sort  (cost=10036.18..10037.38 rows=480 
> width=136)"
> "  Sort Key: b.firstoccurrence, b.sentryid_id, b.node_id"
> "  ->  Seq Scan on c_chkpfw_dy_tr_updates b
> (cost=1.00..10014.80 rows=480 width=136)"
> "  ->  Seq Scan on chkpfw_tr_dy_dimension
> (cost=1.00..100168078.81 rows=6281981 width=70)"
> 
> -Original Message-
> From: Jim C. Nasby [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, April 12, 2006 5:44 PM
> To: Sriram Dandapani
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] multi column query
> 
> On Wed, Apr 12, 2006 at 05:32:32PM -0700, Sriram Dandapani wrote:
> > Hi
> > 
> > When I update a table that has 20 columns and the where clause
> includes
> > 16 of the columns (this is a data warehousing type update 
> on aggregate
> > fields),
> > 
> > The bitmap scan is not used by the optimizer. The table is 
> indexed on
> 3
> > of the 20 fields. The update takes really long to finish (on a 6
> million
> > row table)
> > 
> > Do I need to do some "magic" with configuration to turn on bitmap
> scans.
> 
> No. What's explain analyze of the query show? What's it doing now?
> Seqscan? You might try set enable_seqscan=off and see what that does.
> -- 
> Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> Pervasive Software  http://pervasive.comwork: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
> 



Re: [PERFORM] pgmemcache

2006-04-13 Thread Tom Lane
Christian Storm <[EMAIL PROTECTED]> writes:
> Not sure if I follow why this is a problem.  Seems like it would be  
> beneficial to have both BEFORE and AFTER COMMIT triggers.
> With the BEFORE COMMIT trigger you would have the ability to 'un- 
> commit' (rollback) the transaction.  With
> the AFTER COMMIT trigger you wouldn't have that option because the  
> commit has already been successful.  However,
> with an AFTER COMMIT you would be able to trigger other downstream  
> events that rely on a transaction successfully committing.

An AFTER COMMIT trigger would have to be in a separate transaction.
What happens if there's more than one, and one of them fails?  Even
more to the point, if it's a separate transaction, don't you have
to fire all these triggers again when you commit that transaction?
The idea seems circular.

regards, tom lane



Re: [PERFORM] bad performance on Solaris 10

2006-04-13 Thread Robert Lor

Bruce Momjian wrote On 04/13/06 01:39 AM,:
> 
> Yes, if someone wants to give us a clear answer on which wal_sync method
> is best on all versions of Solaris, we can easily make that change.
> 

We're doing tests to see how various parameters in postgresql.conf
affect performance on Solaris and will share the results shortly.

Regards,
-Robert




[PERFORM] Blocks read for index scans

2006-04-13 Thread Jim Nasby
While working on determining a good stripe size for a database, I  
realized it would be handy to know what the average request size is.  
Getting this info is a simple matter of joining pg_stat_all_tables  
and pg_statio_all_tables and doing some math, but there's one issue  
I've found; it appears that there's no information on how many heap  
blocks were read in by an index scan. Is there any way to get that info?
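
A rough sketch of that math (view and column names as in 8.1; heap blocks
fetched specifically by index scans are not broken out, which is the gap
described above):

SELECT s.schemaname, s.relname,
       io.heap_blks_read + COALESCE(io.idx_blks_read, 0) AS blks_read,
       s.seq_scan + s.idx_scan                           AS scans,
       (io.heap_blks_read + COALESCE(io.idx_blks_read, 0))::float8
           / NULLIF(s.seq_scan + s.idx_scan, 0)          AS blks_per_scan
  FROM pg_stat_all_tables s
  JOIN pg_statio_all_tables io USING (relid)
 ORDER BY blks_read DESC;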

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461





Re: [PERFORM] pgmemcache

2006-04-13 Thread Jim Nasby

On Apr 13, 2006, at 12:38 PM, Tom Lane wrote:

> Christian Storm <[EMAIL PROTECTED]> writes:
>> Not sure if I follow why this is a problem.  Seems like it would be
>> beneficial to have both BEFORE and AFTER COMMIT triggers.
>> With the BEFORE COMMIT trigger you would have the ability to 'un-
>> commit' (rollback) the transaction.  With
>> the AFTER COMMIT trigger you wouldn't have that option because the
>> commit has already been successful.  However,
>> with an AFTER COMMIT you would be able to trigger other downstream
>> events that rely on a transaction successfully committing.
>
> An AFTER COMMIT trigger would have to be in a separate transaction.
> What happens if there's more than one, and one of them fails?  Even
> more to the point, if it's a separate transaction, don't you have
> to fire all these triggers again when you commit that transaction?
> The idea seems circular.


I suspect that in reality you'd probably want each on-commit trigger
to be its own transaction, but it depends on what you're doing.
Also, I can't see any use for them where you'd actually be  
interacting with the database, only if you were calling something  
externally via a function. One example would be sending an email out  
when a certain table changes; in many cases it's better to let the  
change happen even if the email can't be sent, and you'd rather not  
send an email if the transaction just ends up rolling back for some  
reason. And yes, you'd have to ensure you didn't code yourself up a  
trigger loop.

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461





Re: [PERFORM] Inserts optimization?

2006-04-13 Thread Francisco Reyes

Chris writes:

> If you can, use copy instead:
> http://www.postgresql.org/docs/8.1/interactive/sql-copy.html

I am familiar with copy.
Can't use it in this scenario.

The data is coming from a program called Bacula (backup server).
It is not static data.



Re: [PERFORM] Blocks read for index scans

2006-04-13 Thread Jim Nasby
Adding -performance back in...

> From: Steve Poe [mailto:[EMAIL PROTECTED]
> Jim,
> 
> I could be way off, but doesn't from pg_statio_user_tables 
> contain this
> information?

http://www.postgresql.org/docs/8.1/interactive/monitoring-stats.html#MONITORING-STATS-VIEWS
 states:

"numbers of disk blocks read and buffer hits in all indexes of that table"

That leads me to believe that it's only tracking index blocks read, and not 
heap blocks read. One could presume that each index row read as reported by 
pg_stat_all_tables would represent a heap block read, but a large number of 
those would (hopefully) have already been in shared_buffers.

> On Thu, 2006-04-13 at 13:00 -0500, Jim Nasby wrote:
> > While working on determining a good stripe size for a database, I  
> > realized it would be handy to know what the average request 
> size is.  
> > Getting this info is a simple matter of joining pg_stat_all_tables  
> > and pg_statio_all_tables and doing some math, but there's 
> one issue  
> > I've found; it appears that there's no information on how 
> many heap  
> > blocks were read in by an index scan. Is there any way to 
> get that info?
--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Inserts optimization?

2006-04-13 Thread Francisco Reyes

Tom Lane writes:

> Or at least try to do multiple inserts per transaction.

Will see if the program has an option like that.
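
(A hypothetical sketch of what "multiple inserts per transaction" looks
like; the table and columns are made up. Batching pays the per-commit
fsync cost once per batch instead of once per row.)

BEGIN;
INSERT INTO backup_file (jobid, filename) VALUES (1, '/etc/passwd');
INSERT INTO backup_file (jobid, filename) VALUES (1, '/etc/hosts');
-- ... many more rows ...
COMMIT;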



> Also, increasing checkpoint_segments and possibly wal_buffers helps a
> lot.

Will try those.

> Try to get the WAL onto a separate disk
> spindle if you can.  (These things don't matter for SELECTs, but they
> do matter for writes.)


This particular server is pretty much what I inherited for now for this
project, and it's RAID 5. There is a new server I am setting up
soon... 8 disks which we are planning to set up as:

6 disks in RAID 10
2 hot spares

In RAID 10, would it matter that the WAL is in the same RAID set?
Would it be better to have:
4 disks in RAID 10 for data
2 disks in RAID 1 for WAL
2 hot spares

All on the same RAID controller.



Re: [PERFORM] bad performance on Solaris 10

2006-04-13 Thread Merlin Moncure
On 4/12/06, Josh Berkus  wrote:
> People,
>
> > Lately I find people are not so receptive to VxFS, and Sun is promoting
> > ZFS, and we don't have a reasonable near term option for Raw IO in
> > Postgres, so we need to work to find a reasonable path for Solaris users
> > IMO. The long delays in ZFS production haven't helped us there, as the
> > problems with UFS are severe.

I just recently worked with Sun Solaris 10 and found it to be
reasonably performant without much tuning.  This was on a dual-SPARC
Sun Blade workstation which I felt was very well engineered.  I was
able (with zero Solaris experience) to get PostgreSQL up and crunching
away at some really data-intensive tasks while running an application
compiled with their very excellent Fortran compiler.

In the enterprise world I am finding that the only Linux distributions
supported are Red Hat and SUSE, meaning if you have a problem with your
SAN running against your Gentoo box you have a serious problem.
Solaris, OTOH, is generally very well supported (especially on Sun
hardware) and is free.  So I give Sun great credit for providing a
free, if not necessarily completely open, platform for developing open
source applications in an enterprise environment.

Merlin



Re: [PERFORM] multi column query

2006-04-13 Thread Sriram Dandapani
Hi Jim

The problem is fixed. The destination table that was being updated had 3
separate indexes. I combined them into a multi-column index and the effect
was amazing.
Thanks for your input

Sriram

-Original Message-
From: Jim Nasby [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 13, 2006 9:42 AM
To: Sriram Dandapani
Cc: Pgsql-Performance (E-mail)
Subject: RE: [PERFORM] multi column query

You need to run EXPLAIN ANALYZE. Also, what's random_page_cost set to?
And the output of \d chkpfw_tr_dy_dimension. The cost for that index
scan looks way too high.

And please reply-all so that the list is included.

> -Original Message-
> From: Sriram Dandapani [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, April 12, 2006 7:48 PM
> To: Jim Nasby
> Subject: RE: [PERFORM] multi column query
> 
> 
> I executed enable_seqscan=off and then ran an explain plan on 
> the query
> 
> UPDATE chkpfw_tr_dy_dimension
>SET summcount = a.summcount + b.summcount,
>bytes = a.bytes + b.bytes,
>duration = a.duration + b.duration
>from chkpfw_tr_dy_dimension a,
> c_chkpfw_dy_tr_updates b
>WHERE a.firstoccurrence = b.firstoccurrence
>   AND a.customerid_id = b.customerid_id
>AND a.sentryid_id = b.sentryid_id
>   AND a.node_id = b.node_id
>AND a.interface_id = b.interface_id
>AND a.source_id = b.source_id
>AND a.destination_id = b.destination_id
>AND a.sourceport_id = b.sourceport_id
>AND a.destinationport_id = b.destinationport_id
>AND a.inoutbound_id = b.inoutbound_id
>AND a.action_id = b.action_id
>AND a.protocol_id = b.protocol_id
>AND a.service_id = b.service_id
>AND a.sourcezone_id = b.sourcezone_id
>AND a.destinationzone_id =
> b.destinationzone_id;
> 
> 
> 
> Here is the query plan
> 
> 
> "Nested Loop  (cost=20036.18..221851442.39 rows=1 width=166)"
> "  ->  Merge Join  (cost=10036.18..121620543.75 rows=1 width=96)"
> "Merge Cond: (("outer".firstoccurrence =
> "inner".firstoccurrence) AND ("outer".sentryid_id = 
> "inner".sentryid_id)
> AND ("outer".node_id = "inner".node_id))"
> "Join Filter: (("outer".customerid_id = "inner".customerid_id)
> AND ("outer".interface_id = "inner".interface_id) AND 
> ("outer".source_id
> = "inner".source_id) AND ("outer".destination_id =
> "inner".destination_id) AND ("outer".sourceport_id = "inner".s (..)"
> "->  Index Scan using chkpfw_tr_dy_idx1 on
> chkpfw_tr_dy_dimension a  (cost=0.00..21573372.84 rows=6281981
> width=88)"
> "->  Sort  (cost=10036.18..10037.38 rows=480 
> width=136)"
> "  Sort Key: b.firstoccurrence, b.sentryid_id, b.node_id"
> "  ->  Seq Scan on c_chkpfw_dy_tr_updates b
> (cost=1.00..10014.80 rows=480 width=136)"
> "  ->  Seq Scan on chkpfw_tr_dy_dimension
> (cost=1.00..100168078.81 rows=6281981 width=70)"
> 
> -Original Message-
> From: Jim C. Nasby [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, April 12, 2006 5:44 PM
> To: Sriram Dandapani
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] multi column query
> 
> On Wed, Apr 12, 2006 at 05:32:32PM -0700, Sriram Dandapani wrote:
> > Hi
> > 
> > When I update a table that has 20 columns and the where clause
> includes
> > 16 of the columns (this is a data warehousing type update 
> on aggregate
> > fields),
> > 
> > The bitmap scan is not used by the optimizer. The table is 
> indexed on
> 3
> > of the 20 fields. The update takes really long to finish (on a 6
> million
> > row table)
> > 
> > Do I need to do some "magic" with configuration to turn on bitmap
> scans.
> 
> No. What's explain analyze of the query show? What's it doing now?
> Seqscan? You might try set enable_seqscan=off and see what that does.
> -- 
> Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> Pervasive Software  http://pervasive.comwork: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
> 



Re: [PERFORM] pgmemcache

2006-04-13 Thread PFC



> An AFTER COMMIT trigger would have to be in a separate transaction.

	I guess AFTER COMMIT triggers would be like a NOTIFY, but more powerful.
While NOTIFY can't transmit information to another process, this trigger
could, and the other process could then view the results of the committed
transaction.
	Also, implementing a long process (something involving network
roundtrips, for instance) in a BEFORE COMMIT trigger would delay the
transaction and any locks it holds with no benefit.

> What happens if there's more than one, and one of them fails?

	Each one in its own transaction?

> Even more to the point, if it's a separate transaction, don't you have
> to fire all these triggers again when you commit that transaction?
> The idea seems circular.

	I guess AFTER COMMIT triggers are most useful when coupled to a trigger
on a modification to a table. So, the "before / after commit" could be an
attribute of an AFTER INSERT/UPDATE/DELETE trigger. If the AFTER COMMIT
trigger doesn't do any modifications to the target table, there will be no
infinite loop.

	The before/after commit could also be implemented not via triggers, but
via deferred actions, by telling postgres to execute a specific query just
before/after the transaction commits. This could be used to implement the
triggers, but would also be more generic: a trigger on INSERT could then
defer a call to a memcache update once the transaction is committed. It
gets Lisp-ish, but it would be really cool.




Re: [PERFORM] pgmemcache

2006-04-13 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Christian Storm <[EMAIL PROTECTED]> writes:
> > Not sure if I follow why this is a problem.  Seems like it would be  
> > beneficial to have both BEFORE and AFTER COMMIT triggers.
> > With the BEFORE COMMIT trigger you would have the ability to 'un- 
> > commit' (rollback) the transaction.  With
> > the AFTER COMMIT trigger you wouldn't have that option because the  
> > commit has already been successful.  However,
> > with an AFTER COMMIT you would be able to trigger other downstream  
> > events that rely on a transaction successfully committing.
> 
> An AFTER COMMIT trigger would have to be in a separate transaction.
> What happens if there's more than one, and one of them fails?  Even
> more to the point, if it's a separate transaction, don't you have
> to fire all these triggers again when you commit that transaction?
> The idea seems circular.

Maybe it just means they would have to be limited to not making any database
modifications. Ie, all they can do is notify the outside world that the
transaction committed. 

Presumably if you wanted to make any database modifications you would just do
it in the transaction anyways since they wouldn't show up until the
transaction commits.

-- 
greg




Re: [PERFORM] Blocks read for index scans

2006-04-13 Thread Tom Lane
Jim Nasby <[EMAIL PROTECTED]> writes:
> While working on determining a good stripe size for a database, I  
> realized it would be handy to know what the average request size is.  
> Getting this info is a simple matter of joining pg_stat_all_tables  
> and pg_statio_all_tables and doing some math, but there's one issue  
> I've found; it appears that there's no information on how many heap  
> blocks were read in by an index scan. Is there any way to get that info?

If the table is otherwise idle, the change in the table's entry in
pg_statio_all_tables should do, no?

(This is as of 8.1 ... older versions acted a bit differently IIRC.)

regards, tom lane



Re: [PERFORM] pgmemcache

2006-04-13 Thread Tom Lane
PFC <[EMAIL PROTECTED]> writes:
>   I guess AFTER COMMIT triggers would be like a NOTIFY, but more 
> powerful.  

I'll let you in on a secret: NOTIFY is actually a before-commit
operation.  This is good enough because it never, or hardly ever,
fails.  I would argue that anything you want to do in an AFTER COMMIT
trigger could just as well be done in a BEFORE COMMIT trigger; if that's
not reliable enough then you need to fix your trigger.

regards, tom lane



Re: [PERFORM] pg 7.4.x - pg_restore impossibly slow

2006-04-13 Thread patrick keshishian
Hi Tom, et al.,

So I changed the following settings in postgresql.conf file and
restarted PostgreSQL and then proceeded with pg_restore:

# new changes for this test-run
log_statement = true
sort_mem = 10240 # default 1024
vacuum_mem = 20480 # default 8192
# from before
checkpoint_segments = 10
log_pid = true
log_timestamp = true

With these settings and running:

pg_restore -vaOd dbname dbname.DUMP


Things seem to progress better. The first of the large
tables got COPY'ed within 1 hr 40 min:

start: 2006-04-13 11:44:19
finish: 2006-04-13 13:25:36

I ended up ctrl-C'ing out of the pg_restore as the second
large table was taking over 3 hours and the last PostgreSQL
log entry was from over 2.5hrs ago, with message:

2006-04-13 14:09:29 [3049] LOG:  recycled transaction log file
"0006006B"

Time for something different. Before attempting the same
procedure with fsync off, I ran the following sequence of
commands:

$ dropdb dbname
$ createdb dbname
$ pg_restore -vsOd dbname dbname.DUMP
$ date > db.restore ; pg_restore -vcOd dbname \
dbname.DUMP ; date >> db.restore
$ cat db.restore
Thu Apr 13 18:02:51 PDT 2006
Thu Apr 13 18:17:16 PDT 2006

That's just over 14 minutes!

Ideas?

Is this because the -c option drops all foreign keys and
so the restore goes faster?  Should this be the preferred,
recommended and documented method to run pg_restore?
Any drawbacks to this method?

Thanks,
--patrick




On 4/12/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> "patrick keshishian" <[EMAIL PROTECTED]> writes:
> > My dev box is much slower hardware than the customer's
> > server.  Even with that difference I expected to be able to
> > pg_restore the database within one day.  But no.
>
> Seems a bit odd.  Can you narrow down more closely which step of the
> restore is taking the time?  (Try enabling log_statements.)
>
> One thought is that kicking up work_mem and vacuum_mem is likely to
> help for some steps (esp. CREATE INDEX and foreign-key checking).
> And be sure you've done the usual tuning for write-intensive activity,
> such as bumping up checkpoint_segments.  Turning off fsync wouldn't
> be a bad idea either.
>
> regards, tom lane



Re: [PERFORM] pg 7.4.x - pg_restore impossibly slow

2006-04-13 Thread Tom Lane
"patrick keshishian" <[EMAIL PROTECTED]> writes:
> With these settings and running:
> pg_restore -vaOd dbname dbname.DUMP

If you had mentioned you were using random nondefault switches, we'd
have told you not to.  -a in particular is a horrid idea performancewise
--- a standard schema-plus-data restore goes way faster because it's
doing index builds and foreign key checks wholesale instead of
incrementally.

> Is this because the -c option drops all foreign keys and
> so the restore goes faster?  Should this be the preferred,
> recommended and documented method to run pg_restore?

It is documented in recent versions of the documentation: see
http://www.postgresql.org/docs/8.1/static/populate.html
particularly the last section.

regards, tom lane



Re: [PERFORM] Inserts optimization?

2006-04-13 Thread Jim C. Nasby
On Thu, Apr 13, 2006 at 02:59:23PM -0400, Francisco Reyes wrote:
> In RAID 10, would it matter that the WAL is in the same RAID set?
> Would it be better to have:
> 4 disks in RAID 10 for data
> 2 disks in RAID 1 for WAL
> 2 hot spares

Well, benchmark it with your app and find out, but generally speaking
unless your database is mostly read you'll see a pretty big benefit to
seperating WAL from table/index data.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Blocks read for index scans

2006-04-13 Thread Jim C. Nasby
On Thu, Apr 13, 2006 at 08:36:09PM -0400, Tom Lane wrote:
> Jim Nasby <[EMAIL PROTECTED]> writes:
> > While working on determining a good stripe size for a database, I  
> > realized it would be handy to know what the average request size is.  
> > Getting this info is a simple matter of joining pg_stat_all_tables  
> > and pg_statio_all_tables and doing some math, but there's one issue  
> > I've found; it appears that there's no information on how many heap  
> > blocks were read in by an index scan. Is there any way to get that info?
> 
> If the table is otherwise idle, the change in the table's entry in
> pgstatio_all_tables should do, no?

Ahh, ok, I see the heap blocks are counted. So I guess if you wanted to
know the average number of blocks read from the heap per request,
you'd have to do heap_blks_read / ( seq_scan + idx_scan ), with the
last two coming from pg_stat_all_tables.
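
As a rough per-table sketch of that calculation (the NULLIF just avoids
division by zero for tables that have never been scanned):

SELECT s.relname,
       io.heap_blks_read::float8
           / NULLIF(s.seq_scan + s.idx_scan, 0) AS heap_blks_per_scan
  FROM pg_stat_all_tables s
  JOIN pg_statio_all_tables io USING (relid);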

In my case it would be helpful to break the heap access numbers out
between seqscans and index scans, since each of those represents very
different access patterns. Would adding that be a mess?

> (This is as of 8.1 ... older versions acted a bit differently IIRC.)

Yeah; I recall that it was pretty confusing exactly how things were
broken out and that you changed it as part of the bitmap scan work.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] pg 7.4.x - pg_restore impossibly slow

2006-04-13 Thread Jim C. Nasby
On Thu, Apr 13, 2006 at 06:26:00PM -0700, patrick keshishian wrote:
> $ dropdb dbname
> $ createdb dbname
> $ pg_restore -vsOd dbname dbname.DUMP

That step is pointless, because the next pg_restore will create the
schema for you anyway.

> $ date > db.restore ; pg_restore -vcOd dbname \
> dbname.DUMP ; date >> db.restore
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Blocks read for index scans

2006-04-13 Thread Terje Elde

Jim Nasby wrote:
> While working on determining a good stripe size for a database, I
> realized it would be handy to know what the average request size is.
> Getting this info is a simple matter of joining pg_stat_all_tables and
> pg_statio_all_tables and doing some math, but there's one issue I've
> found; it appears that there's no information on how many heap blocks
> were read in by an index scan. Is there any way to get that info?


RAID usually doesn't work the way most people think. ;)

Not sure how well you know RAID, so I'm just mentioning some points just 
in case, and for the archives.


If your average request is for 16K, and you choose a 16K stripe size,
then that means half your requests (assuming a normal bell curve) would
be larger than a single stripe, and you've just succeeded in making half
your requests seek on two spindles instead of one.  If that's done
sequentially, you're set for less than half the performance of a
flat disk.


Knowing what the average request size is can be a good place to start,
but the real question is: which stripe size will allow the majority of
your transactions to be satisfied without having to go to two
spindles?


I've actually had good success with 2MB stripe sizes using software 
raid.  If the reads are fairly well distributed, all the drives are hit 
equally, and very few small requests have to go to two spindles.


Read speeds from modern drives are fast.  It's usually the seeks that 
kill performance, so making sure you reduce the number of seeks should 
almost always be the priority.


That said, it's the transactions against disk that typically matter.  On 
FreeBSD, you can get an impression of this using 'systat -vmstat', and 
watch the KB/t column for your drives.


A seek always takes some time: the head has to settle down, find the
right place to start reading, and so on.  A seek over a longer distance
takes more time, though, so even if your transactions are pretty small,
using a large stripe size can be a good thing if you have lots of small
transactions that are close by.  The head will be in the area, reducing
seek time.


This all depends on what types of load you have, and it's hard to 
generalize too much on what makes things fast.  As always, it pretty 
much boils down to trying things while running as close to production 
load as you can.


Terje


