Re: [PERFORM] partitioning question 1

2010-10-29 Thread Igor Neyman
 -Original Message-
 From: Ben [mailto:midfi...@gmail.com] 
 Sent: Thursday, October 28, 2010 12:37 PM
 To: pgsql-performance@postgresql.org
 Subject: partitioning question 1
 
 hello --
 
 my last email was apparently too long to respond to so i'll 
 split it up into shorter pieces.  my first question :
 
 my understanding of how range partitioning and constraint 
 exclusion works leads me to believe that it does not buy any 
 query performance that a clustered index doesn't already give 
 you -- the advantages are all in maintainability.  an index 
 is able to eliminate pages just as well as constraint 
 exclusion is able to eliminate table partitions.  the I/O 
 advantages of having queries target small subtables are the 
 same as the I/O advantages of clustering the index : result 
 pages in a small range are very close to each other on disk.
 
 finally, since constraint exclusion isn't as flexible as 
 indexing (i've seen old mailing list posts that say that 
 constraint exclusion only works with static constants in 
 where clauses, and only works with simple comparison operators like <, <=, =, >= and >, 
 which basically forces btree indexes when i want to use gist) 
 it is indeed likely that partitioning can be slower than one 
 big table with a clustered index.
 
 is my intuition completely off on this?
 
 best regards, ben
 

If your SELECT retrieves a substantial number of records, a table scan could
be more efficient than index access.

Now, if while retrieving a large number of records the WHERE clause of this
SELECT still satisfies the constraints on some partition(s), then obviously
scanning one (or a few) partitions will be more efficient than a full table
scan of a non-partitioned table.

So yes, partitioning provides performance improvements, not only
maintenance convenience.

Regards,
Igor Neyman
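
A minimal sketch of the kind of setup being discussed, using the 8.x
inheritance-based approach (table, column, and partition names are hypothetical):

CREATE TABLE measurements (id bigint, logdate date, value numeric);

CREATE TABLE measurements_2010_10 (
    CHECK (logdate >= DATE '2010-10-01' AND logdate < DATE '2010-11-01')
) INHERITS (measurements);

CREATE TABLE measurements_2010_11 (
    CHECK (logdate >= DATE '2010-11-01' AND logdate < DATE '2010-12-01')
) INHERITS (measurements);

-- constraint_exclusion = partition is the default from 8.4 on; with it enabled,
-- a constant range in the WHERE clause prunes the non-matching child tables
SET constraint_exclusion = partition;
SELECT * FROM measurements
WHERE logdate >= DATE '2010-10-05' AND logdate < DATE '2010-10-06';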



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread Aidan Van Dyk
On Fri, Oct 29, 2010 at 11:43 AM, Robert Haas robertmh...@gmail.com wrote:

 Well, we COULD keep the data in shared buffers, and then copy it into
 an mmap()'d region rather than calling write(), but I'm not sure
 there's any advantage to it.  Managing address space mappings is a
 pain in the butt.

I could see this being a *theoretical* benefit in the case that the
background writer gains the ability to write out all blocks associated
with a file in order.  In that case, you might get a win because you
could get a single mmap of the entire file, and just wholesale memcpy
blocks across, then sync/unmap it.

This, of course, assumes a few things that must hold for it to be performant:
0) a list of blocks to be written grouped by files is readily available.
1) The pages you write to must be in the page cache, or your memcpy is
going to fault them in.  With a plain write, you don't need the
over-written page in the cache.
2) Now, instead of the torn-page problem being FS block/sector sized,
you can actually have a possibly arbitrary amount of the
block memory written when the kernel writes out the page.  You
*really* need full-page-writes.
3) The mmap overhead required for the kernel to setup the mappings is
less than the repeated syscalls of a simple write().

All those things seem like something that somebody could synthetically
benchmark to prove the value of before even trying to bolt it into PostgreSQL.

a.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Oct 28, 2010 at 5:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It's true that we don't know whether write() causes an immediate or
 delayed disk write, but we generally don't care that much.  What we do
 care about is being able to ensure that a WAL write happens before the
 data write, and with mmap we don't have control over that.

 Well, we COULD keep the data in shared buffers, and then copy it into
 an mmap()'d region rather than calling write(), but I'm not sure
 there's any advantage to it.  Managing address space mappings is a
 pain in the butt.

In principle that ought to be right about the same speed as using
write() to copy the data from shared buffers to kernel disk buffers,
anyway.

regards, tom lane



Re: [PERFORM] partitioning question 1

2010-10-29 Thread Ben
On Oct 29, 2010, at 7:38 AM, Igor Neyman wrote:

 is my intuition completely off on this?
 
 best regards, ben
 
 
 If your SELECT retrieves a substantial number of records, a table scan could
 be more efficient than index access.
 
 Now, if while retrieving a large number of records the WHERE clause of this
 SELECT still satisfies the constraints on some partition(s), then obviously
 scanning one (or a few) partitions will be more efficient than a full table
 scan of a non-partitioned table.
 
 So yes, partitioning provides performance improvements, not only
 maintenance convenience.

my impression was that a *clustered* index would give a lot of the same I/O 
benefits, in a more flexible way.  if you're clustered on the column in 
question, then an index scan for a range is much like a sequential scan over a 
partition (as far as i understand.)

b
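
For reference, the clustered-index alternative looks roughly like this (reusing
the hypothetical measurements table from the partitioning sketch above); note
that in PostgreSQL, CLUSTER is a one-time physical reordering and is not
maintained as new rows arrive:

CREATE INDEX measurements_logdate_idx ON measurements (logdate);
CLUSTER measurements USING measurements_logdate_idx;  -- rewrites the table in index order
ANALYZE measurements;

-- range queries on logdate now touch physically adjacent pages,
-- until enough churn accumulates that the table needs re-clustering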


Re: [PERFORM] partitioning question 1

2010-10-29 Thread Igor Neyman
 

 -Original Message-
 From: Ben [mailto:midfi...@gmail.com] 
 Sent: Friday, October 29, 2010 12:16 PM
 To: Igor Neyman
 Cc: pgsql-performance@postgresql.org
 Subject: Re: partitioning question 1
 
 On Oct 29, 2010, at 7:38 AM, Igor Neyman wrote:
 
  is my intuition completely off on this?
  
  best regards, ben
  
  
  If your SELECT retrieves a substantial number of records, a table scan 
  could be more efficient than index access.
  
  Now, if while retrieving a large number of records the WHERE clause of 
  this SELECT still satisfies the constraints on some partition(s), then 
  obviously scanning one (or a few) partitions will be more efficient than 
  a full table scan of a non-partitioned table.
  
  So yes, partitioning provides performance improvements, not only 
  maintenance convenience.
 
 my impression was that a *clustered* index would give a lot 
 of the same I/O benefits, in a more flexible way.  if you're 
 clustered on the column in question, then an index scan for a 
 range is much like a sequential scan over a partition (as far 
 as i understand.)
 
 b
 

Even with a clustered index you still read index+table, which is more
expensive than just a table scan (in the situation I described above).
A PG clustered index is not the same as a SQL Server clustered index (which
stores the actual table pages at the leaf level).

Igor Neyman



[PERFORM] typoed column name, but postgres didn't grump

2010-10-29 Thread Jon Nelson
I've been having trouble with a query.
The query is a cross join between two tables.
Initially, I mis-typed the query, and one of the columns specified in
the query doesn't exist; however, the query ran nonetheless.

The actual query:
select gid from t2, t3 where t2.name = t3.name and t3.scope = 'city'
and t3.hierarchy = 'STANDARD' and t2.adiv = t3.adiv limit 1 ;
However, there *is* no column 'name' in table 't2'.
When I ran the query, it took a *really* long time to run (670 seconds).
When I corrected the query to use the right column name (city_name),
the query ran in 28ms.

The question, then, is why didn't postgres grump about the
non-existent column name?

The version is 8.4.5 on x86_64, openSUSE 11.3

 PostgreSQL 8.4.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292], 64-bit


-- 
Jon



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread Robert Haas
On Fri, Oct 29, 2010 at 11:56 AM, Aidan Van Dyk ai...@highrise.ca wrote:
 1) The pages you write to must be in the page cache, or your memcpy is
 going to fault them in.  With a plain write, you don't need the
 over-written page in the cache.

I seem to remember a time many years ago when I got bitten by this
problem.  The fact that our I/O is in 8K pages means this could be a
pretty severe hit, I think.

 2) Now, instead of the torn-page problem being FS block/sector sized
 base, you can now actually have a possibly arbitrary amount of the
 block memory written when the kernel writes out the page.  you
 *really* need full-page-writes.

Yeah.

 3) The mmap overhead required for the kernel to setup the mappings is
 less than the repeated syscalls of a simple write().

You'd expect to save something from that; but on the other hand, at
least on 32-bit systems, there's a very limited number of 1GB files
that can be simultaneously mapped into one address space, and it's a
lot smaller than the number of file descriptors that you can have
open.   Rumor has it that cutting down the number of fds that can stay
open simultaneously is pretty bad for performance, so cutting it down
to a number you can count on one hand (maybe one finger) would
probably be bad.  Maybe on 64-bit it would be OK but it seems like an
awful lot of complexity for at most a minor savings (and a pretty bad
anti-savings if point #1 kicks in).

Anyway this is all totally off-topic...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[PERFORM] CPUs for new databases

2010-10-29 Thread Christian Elmerot @ One.com

Hello,

What is the general view on CPUs nowadays when it comes to 
PostgreSQL performance? Which CPU is the better choice with regard to 
RAM access times, streaming speed, cache synchronization, etc.? Which is the 
better CPU given the limitation of using AMD64 (x86-64)?


We're getting ready to replace our (now) aging db servers with some 
brand new ones with a higher core count. The old ones are 4-socket dual-core 
Opteron 8218's with 48GB RAM. Right now the disk subsystem is not the 
limiting factor, so we're aiming for a higher core count as well as 
faster and more RAM. We're also moving into the territory of version 9.0 
with streaming replication, to be able to offload at least a part of the 
read-only queries to the slave database. The connection count on the 
database usually lies in the region of ~2500 connections, and the 
database is small enough that it can be kept entirely in RAM (the dump is 
about 2.5GB).


Regards,
Christian Elmerot



[PERFORM] MVCC and Implications for (Near) Real-Time Application

2010-10-29 Thread Steve Wong
Hi experts,

I have a (near) real-time application in which inserts into the database need 
to be visible to queries from other threads with minimal delay. The inserts are 
triggered by real-time events and are therefore asynchronous (i.e. many 
performance tips I read related to batch inserts or copy do not apply here, 
since these events cannot be predicted or batched), and the inserted data need 
to be available within a couple of seconds to other threads (for example, an 
inserted row that only appears to other query threads 5 seconds or more after 
the insert is not acceptable). The delay should be under 2 seconds maximum; 
sub-1 second would be great.

My questions are: (1) Does the MVCC architecture introduce significant delays 
between insert by a thread and visibility by other threads (I am unclear about 
how multiple versions are collapsed or reconciled, as well as how different 
query threads are seeing which version)? (2) Are there any available benchmarks 
that can measure this delay? (3) What are relevant config parameters that will 
reduce this delay?

Thanks for your patience with my ignorance of MVCC (still learning more about 
it),
Steve


  

Re: [PERFORM] Stored procedure declared as VOLATILE = no good optimization is done

2010-10-29 Thread Damon Snyder
Thank you for all of the responses. This was really helpful.

Damon

On Sat, Oct 16, 2010 at 12:54 PM, Merlin Moncure mmonc...@gmail.com wrote:

 On Fri, Oct 15, 2010 at 10:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Tatsuo Ishii is...@postgresql.org writes:
  So can I say if a function is marked IMMUTABLE, then it should never
  modify database? Is there any counter example?
  It seems if above is correct, I can say STABLE functions should never
  modify databases as well.
 
  Both of those things are explicitly stated here:
  http://developer.postgresql.org/pgdocs/postgres/xfunc-volatility.html

 Ok, being pedantic here, but:

 I think more interesting is *why* the 'immutable shall not modify the
 database' requirement is there.  IOW, suppose you ignore the warnings
 in the docs and force immutability on a function that writes (via the
 function loophole) to the database, why exactly is this a bad idea?
 The reasoning given in the documentation explains a problematic
 symptom of doing so, but gives little technical reasoning why it
 should never be done.

 One reason why writing to the database breaks immutability is that
 writing to the database depends on resources that can change after the
 fact: function immutability also pertains to failure -- if a function
 errors (or not) with a set of inputs, it should always do so.  If you
 write to a table, you could violate a constraint from one call to the
 next, or the table may not even be there at all...

 Writing to the database means you are influencing other systems, and
 via constraints they are influencing you, so it makes it wrong by
 definition.  That said, if you were writing to, say, a table with no
 meaningful constraints this actually wouldn't be so bad as long as you
 can also deal with the other big issue with immutability, namely that
 there is not a 1:1 correspondence between when the function is logically
 evaluated and when it is executed.  This more or less eliminates
 logging (at least outside of debugging purposes), the only thing I can
 figure you can usefully do on a table w/no enforceable constraints.
 Also, a big use case for immutable function is to allow use in
 indexing, and it would be just crazy (again, debugging purposes aside)
 to write to a table on index evaluation.

 merlin
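
To illustrate the indexing use case mentioned above, a small sketch (the
function, table, and column names are hypothetical); the function depends only
on its input, so IMMUTABLE is legitimate and allows it in an expression index:

CREATE FUNCTION normalize_email(text) RETURNS text AS $$
    SELECT lower(trim($1));
$$ LANGUAGE sql IMMUTABLE;

CREATE INDEX users_email_norm_idx ON users (normalize_email(email));

-- the planner can now use the index for queries such as:
-- SELECT * FROM users WHERE normalize_email(email) = 'someone@example.com';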



Re: [PERFORM] Postgres insert performance and storage requirement compared to Oracle

2010-10-29 Thread Divakar Singh
Dear All,
Thanks for your inputs on the insert performance part.
Any suggestions on the storage requirement?
VACUUM is certainly not an option, because this is something related to 
maintenance AFTER insertion. 

I am talking about the plain storage requirement w.r.t. Oracle, which I 
observed is twice that of Oracle when millions of rows are inserted.
Has anybody tried to analyze the average storage requirement of PG w.r.t. 
Oracle?

 Best Regards,
Divakar





From: Merlin Moncure mmonc...@gmail.com
To: Robert Haas robertmh...@gmail.com
Cc: Mladen Gogala mladen.gog...@vmsinfo.com; pgsql-performance@postgresql.org
Sent: Wed, October 27, 2010 4:46:53 AM
Subject: Re: [PERFORM] Postgres insert performance and storage requirement 
compared to Oracle

On Tue, Oct 26, 2010 at 6:50 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Oct 26, 2010 at 5:54 PM, Mladen Gogala
 mladen.gog...@vmsinfo.com wrote:
 The table is created with on commit obliterate rows option which means
 that there is no need to do truncate. The truncate command is a heavy
 artillery. Truncating a temporary table is like shooting ducks in a duck
 pond, with a howitzer.

 This is just not true.  ON COMMIT DELETE ROWS simply arranges for a
 TRUNCATE to happen immediately before each commit.  See
 PreCommit_on_commit_actions() in tablecmds.c.

quite so.  If you are doing anything performance sensitive with 'on
commit drop', you are better off organizing a cache around
txid_current() (now(), pid for older pg versions).  Skips the writes
to the system catalogs and truncate.

merlin
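
A rough sketch of what Merlin describes, assuming a single persistent scratch
table keyed by transaction id instead of an ON COMMIT DELETE ROWS temp table
(table and column names are hypothetical):

CREATE TABLE scratch_cache (txid bigint NOT NULL, payload text);
CREATE INDEX scratch_cache_txid_idx ON scratch_cache (txid);

-- inside each transaction: tag rows with txid_current() and read back only your own
INSERT INTO scratch_cache SELECT txid_current(), somecol FROM source_table;
SELECT payload FROM scratch_cache WHERE txid = txid_current();

-- rows from finished transactions can be purged lazily in the background,
-- avoiding per-transaction catalog churn and TRUNCATE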




  

Re: [PERFORM] temporary tables, indexes, and query plans

2010-10-29 Thread Justin Pitts
If you alter the default_statistics_target or any of the specific
statistics targets ( via ALTER TABLE SET STATISTICS ) , the change
will not have an effect until an analyze is performed.

This is implied by
http://www.postgresql.org/docs/9.0/static/planner-stats.html and
http://www.postgresql.org/docs/9.0/static/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET,
but it might save questions like this if it were much more explicit.
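
For example, using the temp table and column from the session quoted below, the
new target is only used once the table is re-analyzed:

ALTER TABLE foo ALTER COLUMN rowb SET STATISTICS 500;
ANALYZE foo;  -- without this, plans continue to be built from the old statistics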

On Wed, Oct 27, 2010 at 2:52 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
 On Wed, Oct 27, 2010 at 1:32 PM, Reid Thompson reid.thomp...@ateb.com wrote:
 On Wed, 2010-10-27 at 13:23 -0500, Jon Nelson wrote:
 set it to 500 and restarted postgres.

 did you re-analyze?

 Not recently. I tried that, initially, and there was no improvement.
 I'll try it again now that I've set the stats to 500.
 The most recent experiment shows me that, unless I create whatever
 indexes I would like to see used *before* the large (first) update,
 then they just don't get used. At all. Why would I need to ANALYZE the
 table immediately following index creation? Isn't that part of the
 index creation process?

 Currently executing is a test where I place an ANALYZE foo after the
 COPY, first UPDATE, and first index, but before the other (much
 smaller) updates.

 ..

 Nope. The ANALYZE made no difference. This is what I just ran:

 BEGIN;
 CREATE TEMPORARY TABLE foo
 COPY ...
 UPDATE ... -- 1/3 of table, approx
 CREATE INDEX foo_rowB_idx on foo (rowB);
 ANALYZE ...
 -- queries from here to 'killed' use WHERE rowB = 'someval'
 UPDATE ... -- 7 rows. seq scan!
 UPDATE ... -- 242 rows, seq scan!
 UPDATE .. -- 3700 rows, seq scan!
 UPDATE .. -- 3100 rows, seq scan!
 killed.


 --
 Jon





Re: [PERFORM] temporary tables, indexes, and query plans

2010-10-29 Thread Justin Pitts
 Jason Pitts:
 RE: changing default_statistics_target (or via ALTER TABLE SET STATS)
 not taking effect until ANALYZE is performed.

 I did already know that, but it's probably good to put into this
 thread. However, you'll note that this is a temporary table created at
 the beginning of a transaction.


( giving up on replying to the group; the list will not accept my posts )
I've been following the thread so long I had forgotten that. I rather
strongly doubt that analyze can reach that table's content inside that
transaction, if you are creating, populating, and querying it all
within that single transaction.



Re: [PERFORM] MVCC and Implications for (Near) Real-Time Application

2010-10-29 Thread A.M.

On Oct 25, 2010, at 2:46 PM, Steve Wong wrote:

 Hi experts,
 
 I have a (near) real-time application in which inserts into the database 
 needs 
 to be visible to queries from other threads with minimal delay. The inserts 
 are 
 triggered by real-time events and are therefore asynchronous (i.e. many 
 performance tips I read related to batch inserts or copy do not apply here, 
 since these events cannot be predicted or batched), and the inserted data 
 need 
 to be available within a couple of seconds to other threads (for example, an 
 inserted row that only appears to other query threads 5 seconds or more after 
 the insert is not acceptable). The delay should be under 2 seconds maximum, 
 sub-1 second would be great.
 
 My questions are: (1) Does the MVCC architecture introduce significant delays 
 between insert by a thread and visibility by other threads (I am unclear 
 about 
 how multiple versions are collapsed or reconciled, as well as how different 
 query threads are seeing which version)? (2) Are there any available 
 benchmarks 
 that can measure this delay? (3) What are relevant config parameters that 
 will 
 reduce this delay?

There is no way to know without testing whether your hardware, OS, database 
schema, and database load can meet your demands. However, there is no technical 
reason why PostgreSQL could not meet your timing goals: MVCC does not 
inherently introduce delays; however, the PostgreSQL implementation requires a 
cleanup process which can introduce latency.

If you find that your current architecture is not up to the task, consider 
using LISTEN/NOTIFY with a payload (new in 9.0), which we are using for a 
similar live-update system.

Cheers,
M
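
A minimal sketch of the LISTEN/NOTIFY-with-payload approach (9.0+); the channel,
table, and payload names here are hypothetical:

-- listener session
LISTEN new_events;

-- writer session: the notification is queued inside the transaction and
-- delivered to listeners only after COMMIT
BEGIN;
INSERT INTO events (payload) VALUES ('sensor 42 fired');
NOTIFY new_events, 'sensor 42 fired';   -- or SELECT pg_notify('new_events', 'sensor 42 fired');
COMMIT;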





Re: [PERFORM] MVCC and Implications for (Near) Real-Time Application

2010-10-29 Thread Kevin Grittner
Steve Wong powerpch...@yahoo.com wrote:
 
 (1) Does the MVCC architecture introduce significant delays
 between insert by a thread and visibility by other threads (I am
 unclear about how multiple versions are collapsed or reconciled,
 as well as how different query threads are seeing which version)?
 
As soon as the inserting transaction commits the inserted row is
visible to new snapshots.  If you are in an explicit transaction the
commit will have occurred before the return from the COMMIT request;
otherwise it will have completed before the return from the INSERT
request.
 
You will get a new snapshot for every statement in READ COMMITTED
(or lower) transaction isolation.  You will get a new snapshot for
each database transaction in higher isolation levels.
 
-Kevin
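
A small illustration with two concurrent sessions (the table name is hypothetical):

-- session A
BEGIN;
INSERT INTO readings (val) VALUES (1);
COMMIT;    -- the row is visible to new snapshots from this point on

-- session B, READ COMMITTED (the default): each statement takes a fresh
-- snapshot, so the first SELECT issued after A's COMMIT already sees the row
SELECT count(*) FROM readings;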



Re: [PERFORM] typoed column name, but postgres didn't grump

2010-10-29 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes:
 Initially, I mis-typed the query, and one of the columns specified in
 the query doesn't exist, however the query ran nonetheless.

 The actual query:
 select gid from t2, t3 where t2.name = t3.name and t3.scope = 'city'
 and t3.hierarchy = 'STANDARD' and t2.adiv = t3.adiv limit 1 ;
 However, there *is* no column 'name' in table 't2'.

This is the old automatic-cast-from-record-to-text-string issue,
ie it treats this like (t2.*)::name.

We've been over this a few times before, but it's not clear that
we can make this throw an error without introducing unpleasant
asymmetry into the casting behavior, as in you couldn't get the
cast when you did want it.

BTW this seems pretty far off-topic for pgsql-performance.

regards, tom lane



Re: [PERFORM] temporary tables, indexes, and query plans

2010-10-29 Thread Robert Haas
On Wed, Oct 27, 2010 at 3:44 PM, Justin Pitts justinpi...@gmail.com wrote:
 Jason Pitts:
 RE: changing default_statistics_target (or via ALTER TABLE SET STATS)
 not taking effect until ANALYZE is performed.

 I did already know that, but it's probably good to put into this
 thread. However, you'll note that this is a temporary table created at
 the beginning of a transaction.


 ( giving up on replying to the group; the list will not accept my posts )

Evidently it's accepting some of them...

 I've been following the thread so long I had forgotten that. I rather
 strongly doubt that analyze can reach that table's content inside that
 transaction, if you are creating, populating, and querying it all
 within that single transaction.

Actually I don't think that's a problem, at least for a manual ANALYZE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PERFORM] typoed column name, but postgres didn't grump

2010-10-29 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 BTW this seems pretty far off-topic for pgsql-performance.
 
It is once you understand what's happening.  It was probably the 11+
minutes for the mistyped query run, versus the 28 ms without the
typo, that led them to this list.
 
I remembered this as an issue that has come up before, but couldn't
come up with good search criteria for finding the old thread before
you posted.  If you happen to have a reference or search criteria
for a previous thread, could you post it?  Otherwise, a brief
explanation of why this is considered a feature worth keeping would
be good.  I know it has been explained before, but it just looks
wrong, on the face of it.
 
Playing around with it a little, it seems like a rather annoying
foot-gun which could confuse people and burn a lot of development
time:
 
test=# create domain make text;
CREATE DOMAIN
test=# create domain model text;
CREATE DOMAIN
test=# create table vehicle (id int primary key, make make);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "vehicle_pkey"
for table "vehicle"
CREATE TABLE
test=# insert into vehicle values (1,
'Toyota'),(2,'Ford'),(3,'Rambler');
INSERT 0 3
test=# select v.make, v.model from vehicle v;
  make   |    model
---------+-------------
 Toyota  | (1,Toyota)
 Ford    | (2,Ford)
 Rambler | (3,Rambler)
(3 rows)
 
If someone incorrectly thinks they've added a column, and the
purported column name happens to match any character-based type or
domain name, they can get a query which behaves in a rather
unexpected way. In this simple query it's pretty easy to spot, but
it could surface in a much more complex query.  If a mistyped query
runs for 11 days instead of 11 minutes, they may have a hard time
spotting the problem.
 
A typo like this could be particularly hazardous in a DELETE or
UPDATE statement.
 
-Kevin



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread Alvaro Herrera
Excerpts from Greg Smith's message of jue oct 21 14:04:17 -0300 2010:

 What I would like to do is beef up the documentation with some concrete 
 examples of how to figure out if your cache and associated write path 
 are working reliably or not.  It should be possible to include does 
 this handle full page writes correctly? in that test suite.  Until we 
 have something like that, I'm concerned that bugs in filesystem or 
 controller handling may make full_page_writes unsafe even with a BBU, 
 and we'd have no way for people to tell if that's true or not.

I think if you assume that there are bugs in the filesystem which you
need to protect against, you are already hosed.  I imagine there must be
some filesystem bug that makes it safe to have full_page_writes=on, but
unsafe to have full_page_writes=off; but I'd probably discard those as a
rare minority and thus not worth worrying about.

I agree it would be worth testing though.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] typoed column name, but postgres didn't grump

2010-10-29 Thread Tom Lane
[ please continue any further discussion in pgsql-bugs only ]

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 BTW this seems pretty far off-topic for pgsql-performance.
 
 It is once you understand what's happening.  It was probably the 11+
 minutes for the mistyped query run, versus the 28 ms without the
 typo, that led them to this list.
 
 I remembered this as an issued that has come up before, but couldn't
 come up with good search criteria for finding the old thread before
 you posted.  If you happen to have a reference or search criteria
 for a previous thread, could you post it?  Otherwise, a brief
 explanation of why this is considered a feature worth keeping would
 be good.  I know it has been explained before, but it just looks
 wrong, on the face of it.

What's going on here is an unpleasant interaction of several different
features:

1. The notations a.b and b(a) are equivalent: either one can mean the
column b of a table a, or an invocation of a function b() that takes
a's composite type as parameter.  This is an ancient PostQUEL-ism,
but we've preserved it because it is helpful for things like
emulating computed columns via functions.

2. The notation t(x) will be taken to mean x::t if there's no function
t() taking x's type, but there is a cast from x's type to t.  This is
just as ancient as #1.  It doesn't really add any functionality, but
I believe we would break a whole lot of users' code if we took it away.
Because of #1, this also means that x.t could mean x::t.

3. As of 8.4 or so, there are built-in casts available from pretty much
any type (including composites) to all the built-in string types, viz
text, varchar, bpchar, name.

Upshot is that t.name is a cast to type name if there's no column or
user-defined function that can match the call.  We've seen bug reports
on this with respect to both the name and text cases, though I'm
too lazy to trawl the archives for them just now.
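
A sketch of how those rules combine, using a hypothetical table t2 with no
column named "name":

CREATE TABLE t2 (gid int, city_name text, adiv text);
CREATE FUNCTION full_label(t2) RETURNS text AS $$
    SELECT $1.city_name || '/' || $1.adiv;
$$ LANGUAGE sql;

-- rule #1: these are two equivalent ways of invoking the function
SELECT full_label(t2) FROM t2;
SELECT t2.full_label  FROM t2;

-- rules #2 and #3: with no column or function named "name", this silently
-- becomes a cast of the whole row, i.e. (t2.*)::name, rather than an error
SELECT t2.name FROM t2;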

So, if you want to throw an error for this, you have to choose which
of these other things you want to break.  I think if I had to pick a
proposal, I'd say we should disable #2 for the specific case of casting
a composite type to something else.  The intentional uses I've seen were
all scalar types; and before 8.4 there was no built-in functionality
that such a call could match.  If we slice off some other part of the
functionality, we risk breaking apps that've worked for many years.

regards, tom lane



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread david

On Fri, 29 Oct 2010, Robert Haas wrote:


On Thu, Oct 28, 2010 at 5:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:

James Mansion ja...@mansionfamily.plus.com writes:

Tom Lane wrote:

The other and probably worse problem is that there's no application
control over how soon changes to mmap'd pages get to disk.  An msync
will flush them out, but the kernel is free to write dirty pages sooner.
So if they're depending for consistency on writes not happening until
msync, it's broken by design.  (This is one of the big reasons we don't
use mmap'd space for Postgres disk buffers.)



Well, I agree that it sucks for the reason you give - but you use
write and that's *exactly* the same in terms of when it gets written,
as when you update a byte on an mmap'd page.


Uh, no, it is not.  The difference is that we can update a byte in a
shared buffer, and know that it *isn't* getting written out before we
say so.  If the buffer were mmap'd then we'd have no control over that,
which makes it mighty hard to obey the WAL write log before data
paradigm.

It's true that we don't know whether write() causes an immediate or
delayed disk write, but we generally don't care that much.  What we do
care about is being able to ensure that a WAL write happens before the
data write, and with mmap we don't have control over that.


Well, we COULD keep the data in shared buffers, and then copy it into
an mmap()'d region rather than calling write(), but I'm not sure
there's any advantage to it.  Managing address space mappings is a
pain in the butt.


keep in mind that you have no way of knowing what order the data in the 
mmap region gets written out to disk.


David Lang


Re: [PERFORM] MVCC and Implications for (Near) Real-Time Application

2010-10-29 Thread Pierre C


My questions are: (1) Does the MVCC architecture introduce significant  
delays between insert by a thread and visibility by other threads


As said by others, once committed it is immediately visible to all


(2) Are there any available benchmarks that can measure this delay?


Since you will not be batching INSERTs, you will use 1 INSERT per  
transaction.

If you use Autocommit mode, that's it.
If you don't, you will get a few extra network roundtrips after the  
INSERT, to send the COMMIT.


One INSERT is usually extremely fast unless you're short on RAM and the  
indexes that need updating need some disk seeking.


Anyway, doing lots of INSERTs each in its own transaction is usually very  
low-throughput, because at each COMMIT, postgres must always be sure that  
all the data is actually written to the harddisks. So, depending on the  
speed of your harddisks, each COMMIT can take up to 10-20 milliseconds.


On a 7200rpm harddisk, it is absolutely impossible to do more than 7200  
commits/minute if you want to be sure each time that the data really is  
written on the harddisk, unless :


- you use several threads (one disk write can group several commits from  
different connections, see the config file docs)
- you turn off synchronous_commit; in this case commit is instantaneous,  
but if your server loses power or crashes, the last few seconds of data  
may be lost (database integrity is still guaranteed though)
- you use a battery-backed cache on your RAID controller, in this case  
"written to the harddisks" is replaced by "written to battery-backed RAM",  
which is a lot faster


If you don't use a battery-backed cache, place the xlog on a different RAID1  
array than the tables/indexes; this allows committing of xlog records  
(which is the time-critical part) to proceed smoothly and not be disturbed  
by other IO on the indexes/tables. Also consider tuning your bgwriter and  
checkpoints, after experimentation under realistic load conditions.


So, when you benchmark your application, if you get disappointing results,  
think about this...
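
A sketch of the knob mentioned above (the grouped-flush behaviour is controlled
separately in postgresql.conf by commit_delay / commit_siblings):

-- per session, or per transaction with SET LOCAL; a crash can lose the last
-- few seconds of commits, but the database stays consistent
SET synchronous_commit = off;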




[PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Ozer, Pam
Unfortunately I have not received a response on this question.  Is more
information needed?  Does anyone have any ideas why the estimates may be
bad?  Or what I might be able to do to speed this up?

 

thanks

 

From: Ozer, Pam 
Sent: Tuesday, October 26, 2010 4:27 PM
To: 'pgsql-performance@postgresql.org'
Subject: Slow Query- Bad Row Estimate

 

I have the following query:

 

select distinct Region.RegionShort as RegionShort, County.County as County
from Region
join PostalCodeRegionCountyCity
  on (PostalCodeRegionCountyCity.RegionId = Region.RegionId)
join DealerGroupGeoCache
  on (DealerGroupGeoCache.RegionId = PostalCodeRegionCountyCity.RegionId)
 and (DealerGroupGeoCache.CountyId = PostalCodeRegionCountyCity.CountyId)
 and (DealerGroupGeoCache.CityId = PostalCodeRegionCountyCity.CityId)
join County on (PostalCodeRegionCountyCity.CountyId = County.CountyId)
where (DealerGroupGeoCache.DealerGroupId = 13)
  and (PostalCodeRegionCountyCity.RegionId = 5)

 

With the following Explain:

 

HashAggregate  (cost=6743.96..6747.36 rows=34 width=11) (actual time=854.407..854.425 rows=57 loops=1)
  ->  Nested Loop  (cost=0.00..6743.28 rows=34 width=11) (actual time=0.062..762.698 rows=163491 loops=1)
        ->  Nested Loop  (cost=0.00..6679.19 rows=34 width=11) (actual time=0.053..260.001 rows=163491 loops=1)
              ->  Index Scan using region_i00 on region  (cost=0.00..3.36 rows=1 width=5) (actual time=0.009..0.011 rows=1 loops=1)
                    Index Cond: (regionid = 5)
              ->  Merge Join  (cost=0.00..6672.43 rows=34 width=10) (actual time=0.040..189.654 rows=163491 loops=1)
                    Merge Cond: ((postalcoderegioncountycity.countyid = dealergroupgeocache.countyid) AND (postalcoderegioncountycity.cityid = dealergroupgeocache.cityid))
                    ->  Index Scan using postalcoderegioncountycity_i06 on postalcoderegioncountycity  (cost=0.00..716.05 rows=2616 width=10) (actual time=0.018..1.591 rows=2615 loops=1)
                          Index Cond: (regionid = 5)
                    ->  Index Scan using dealergroupgeocache_i01 on dealergroupgeocache  (cost=0.00..5719.56 rows=9055 width=10) (actual time=0.015..87.689 rows=163491 loops=1)
                          Index Cond: ((dealergroupgeocache.dealergroupid = 13) AND (dealergroupgeocache.regionid = 5))
        ->  Index Scan using county_i00 on county  (cost=0.00..1.77 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=163491)
              Index Cond: (county.countyid = dealergroupgeocache.countyid)
Total runtime: 854.513 ms

 

The statistics have been recently updated and it does not change the bad
estimates.  

 

The DealerGroupGeoCache table has 765,392 rows, and the query returns 57
rows.

 

I am not at all involved in the way the server is set up so being able
to change the settings is not very likely unless it will make a huge
difference.

 

Is there any way for me to speed up this query without changing the
settings?

 

If not, what changes do you think would be needed?

We are currently running Postgres 8.4 with the following settings.

 

shared_buffers = 500MB            # min 128kB
effective_cache_size = 1000MB

max_connections = 100
temp_buffers = 100MB
work_mem = 100MB
maintenance_work_mem = 500MB
max_files_per_process = 1
seq_page_cost = 1.0
random_page_cost = 1.1
cpu_tuple_cost = 0.1
cpu_index_tuple_cost = 0.05
cpu_operator_cost = 0.01
default_statistics_target = 1000
autovacuum_max_workers = 1

#log_min_messages = DEBUG1
#log_min_duration_statement = 1000
#log_statement = all
#log_temp_files = 128
#log_lock_waits = on
#log_line_prefix = '%m %u %d %h %p %i %c %l %s'
#log_duration = on
#debug_print_plan = on

 

Any help is appreciated,

 

Pam

 

 

 

 



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread James Mansion

Tom Lane wrote:

Uh, no, it is not.  The difference is that we can update a byte in a
shared buffer, and know that it *isn't* getting written out before we
  
Well, I don't know where you got the idea I was referring to that sort of 
thing - it's the same as writing to a buffer before copying to the mmap'd area.

It's true that we don't know whether write() causes an immediate or
delayed disk write, but we generally don't care that much.  What we do
  

Which is what I was referring to.

care about is being able to ensure that a WAL write happens before the
data write, and with mmap we don't have control over that.

  
I think you have just the same control either way, because you can only 
force ordering with an appropriate explicit sync, and in the absence of such 
a sync all bets are off for whether/when each disk page is written out; and 
if you can't ensure that the controller and disk are write-through, you'd 
better do a hardware cache flush too, right?


A shame that so many systems have relatively poor handling of that 
hardware flush.





Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Josh Berkus
On 10/29/10 1:54 PM, Ozer, Pam wrote:
 -  Index Scan using dealergroupgeocache_i01 on
 dealergroupgeocache  (cost=0.00..5719.56 rows=9055 width=10) (actual
 time=0.015..87.689 rows=163491 loops=1)

This appears to be your problem here.

a) when was dealergroupgeocache last ANALYZED?

b) try increasing the stats_target on dealergroupid and regionid, to say
500 and re-analyzing.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com
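
Concretely, (b) would be something like this, using the table and columns from
the plan above:

ALTER TABLE dealergroupgeocache ALTER COLUMN dealergroupid SET STATISTICS 500;
ALTER TABLE dealergroupgeocache ALTER COLUMN regionid SET STATISTICS 500;
ANALYZE dealergroupgeocache;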



Re: [PERFORM] BBU Cache vs. spindles

2010-10-29 Thread david

On Fri, 29 Oct 2010, James Mansion wrote:


Tom Lane wrote:

Uh, no, it is not.  The difference is that we can update a byte in a
shared buffer, and know that it *isn't* getting written out before we

Well, I don't know where you got the idea I was referring to that sort of thing 
- it's the same as writing to a buffer before copying to the mmap'd area.

It's true that we don't know whether write() causes an immediate or
delayed disk write, but we generally don't care that much.  What we do


Which is what I was referring to.

care about is being able to ensure that a WAL write happens before the
data write, and with mmap we don't have control over that.


I think you have just the same control either way, because you can only force 
ordering with an appropriate explicit sync, and in the absence of such a sync 
all bets are off for whether/when each disk page is written out; and if you 
can't ensure that the controller and disk are write-through, you'd better do a 
hardware cache flush too, right?

A shame that so many systems have relatively poor handling of that hardware 
flush.


the issue is that when you update an mmap'd chunk of data, it could be 
written out immediately without you doing _anything_ (and thanks to 
multiple cores/threads, it could get written out while you are still in 
the middle of updating it). When you update an internal buffer and then 
write that, you know that nothing will hit the disk before you issue the 
write command.


David Lang



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Tom Lane
Ozer, Pam po...@automotive.com writes:
 Unfortunately I have not received a response on this question.  Is more
 information needed?  Does anyone have any ideas why the estimates may be
 bad?  Or what I might be able to do to speed this up?

The most likely explanation for the bad rowcount estimates is that there
is correlation between the regionid/countyid/cityid columns, only the
planner doesn't know it.  Can you reformulate that data representation
at all, or at least avoid depending on it as a join key?

regards, tom lane



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Kevin Grittner
Ozer, Pam po...@automotive.com wrote:
 
 Is more information needed?
 
Table layouts of the tables involved (including indexes) would be
interesting.  A description of the machine would be useful,
including OS, CPUs, RAM, and disk system.
 
I know you said you might have trouble changing the config, but some
of these seem problematic.
 
 shared_buffers = 500MB
 effective_cache_size = 1000MB
 max_connections = 100
 temp_buffers = 100MB
 
So you will allow up to 10GB to be tied up in space reserved for
temporary tables, but only expect to cache 1GB of your database? 
That hardly seems optimal.
 
 work_mem = 100MB
 
That could be another 10GB or more in work memory at any moment, if
each connection was running a query which needed one work_mem
allocation.
 
 seq_page_cost = 1.0
 random_page_cost = 1.1
 cpu_tuple_cost = 0.1
 cpu_index_tuple_cost = 0.05
 cpu_operator_cost = 0.01
 
Those settings are OK if the active portion of the database is fully
cached.  Is it?
 
 default_statistics_target = 1000
 
If plan times get long with complex queries, you might want to back
that off; otherwise, OK.
 
 autovacuum_max_workers = 1
 
That seems like a bad idea.  Allowing multiple workers helps reduce
bloat and improve statistics.  If autovacuum is affecting
performance, you would be better off tweaking the autovacuum cost
limits.
 
-Kevin



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Ozer, Pam
I am not sure what you mean by reformulate the data representation.  Do
you mean do I have to join on all three columns? 

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Friday, October 29, 2010 2:18 PM
To: Ozer, Pam
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow Query- Bad Row Estimate 

Ozer, Pam po...@automotive.com writes:
 Unfortunately I have not received a response on this question.  Is
more
 information needed?  Does anyone have any ideas why the estimates may
be
 bad?  Or what I might be able to do to speed this up?

The most likely explanation for the bad rowcount estimates is that there
is correlation between the regionid/countyid/cityid columns, only the
planner doesn't know it.  Can you reformulate that data representation
at all, or at least avoid depending on it as a join key?

regards, tom lane



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Ozer, Pam
I had just analyzed the dealergroupgeocache table. Wow, thank you, that did 
the trick. Can you give me an explanation of how default_statistics_target 
works?  I don't think I completely understand what it means when you set it 
to 500 instead of 1000.

thanks
-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Josh Berkus
Sent: Friday, October 29, 2010 2:10 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow Query- Bad Row Estimate

On 10/29/10 1:54 PM, Ozer, Pam wrote:
 -  Index Scan using dealergroupgeocache_i01 on
 dealergroupgeocache  (cost=0.00..5719.56 rows=9055 width=10) (actual
 time=0.015..87.689 rows=163491 loops=1)

This appears to be your problem here.

a) when was dealergroupgeocache last ANALYZED?

b) try increasing the stats_target on dealergroupid and regionid, to say
500 and re-analyzing.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Tom Lane
Ozer, Pam po...@automotive.com writes:
 I am not sure what you mean by reformulate the data representation.  Do
 you mean do I have to join on all three columns? 

No, I was wondering if you could change things so that you join on just
one column, instead of two that each tell part of the truth.

BTW, did you check your current statistics target?  If it's small
then raising it might possibly fix the problem by itself.

regards, tom lane



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Josh Berkus
On 10/29/10 2:47 PM, Ozer, Pam wrote:
 I had just analyzed the dealergroupgeocache table. Wow, thank you, that did 
 the trick. Can you give me an explanation of how default_statistics_target 
 works?  I don't think I completely understand what it means when you set it 
 to 500 instead of 1000.

You're already at 1000?


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Ozer, Pam
Yes.  The default statistics target was at 1000.  So that would be what the 
column was using correct?

-Original Message-
From: Josh Berkus [mailto:j...@agliodbs.com] 
Sent: Friday, October 29, 2010 2:55 PM
To: Ozer, Pam
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow Query- Bad Row Estimate

On 10/29/10 2:47 PM, Ozer, Pam wrote:
 I had just analyzed the dealergroupgeocache table. Wow, thank you, that did 
 the trick. Can you give me an explanation of how default_statistics_target 
 works?  I don't think I completely understand what it means when you set it 
 to 500 instead of 1000.

You're already at 1000?


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Tom Lane
Ozer, Pam po...@automotive.com writes:
 Yes.  The default statistics target was at 1000.  So that would be what the 
 column was using correct?

But you evidently didn't have stats.  Perhaps you have autovacuum turned
off?  What PG version is this anyway?

regards, tom lane



Re: [PERFORM] Slow Query- Bad Row Estimate

2010-10-29 Thread Ozer, Pam
It's 8.4.  On the column, stats_target was -1 before I changed it.  AutoVacuum
is set to on.  I actually did a full analyze of the database and then
ran it again.  So what am I missing?

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Friday, October 29, 2010 3:03 PM
To: Ozer, Pam
Cc: Josh Berkus; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow Query- Bad Row Estimate 

Ozer, Pam po...@automotive.com writes:
 Yes.  The default statistics target was at 1000.  So that would be
what the column was using correct?

But you evidently didn't have stats.  Perhaps you have autovacuum turned
off?  What PG version is this anyway?

regards, tom lane
