Re: [PERFORM] SQL With Dates

2009-04-20 Thread Mark Lewis
It sounds like what you're doing is comparing the planner's cost
estimate from running EXPLAIN on a few different queries.  The planner's
cost estimate was never intended to do what you're trying to do; it's
not an absolute scale of cost, it's just a tool that the planner uses to
get relative comparisons of logically equivalent plans.

The actual number that the planner spits out is meaningless in an
absolute sense.  It's entirely possible for a query with a much higher
estimated cost to run faster than one with a lower estimated cost.
What you actually need to do is compare the real running time of the
queries in order to see which ones are actually problematic.

For that, you'd do better using a tool like pgFouine to look at actual
performance trends. 
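
For illustration, a minimal sketch of that comparison (the table and
column names are made up; only the pattern matters):

-- hypothetical table and column, for illustration only
EXPLAIN ANALYZE
SELECT * FROM events WHERE event_date = '2009-04-01';

EXPLAIN ANALYZE
SELECT * FROM events
WHERE event_date >= '2009-04-01' AND event_date <= '2009-04-15';

The estimated costs in the two plans are not comparable on any absolute
scale, but the "actual time" and "Total runtime" figures that EXPLAIN
ANALYZE prints are real measurements, so those are the numbers to
compare.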

-- Mark


On Mon, 2009-04-20 at 10:55 -0300, Rafael Domiciano wrote:
 Hello People,
 
 I have initiated a review of the SQL of our internal software.
 For a lot of them the problem is the SQL logic itself, unnecessary
 joins, and so on.
 But the software also has a lot of SQL involving dates, doing things
 like:
 [..]
   date >= '2009-04-01' AND
   date <= '2009-04-15'
 [..]
 
 Redoing the SQL with a fixed date (date = '2009-04-01'), the cost in
 EXPLAIN always stays around 200 or less. But with a date range the
 cost is high, about 6000 or more.

 The select is using an index, and the date column is indexed too.

 Is there some way to use a date range with less cost?
 
 Rafael Domiciano



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] multicolumn indexes still efficient if not fully stressed?

2009-01-12 Thread Mark Lewis
On Mon, 2009-01-12 at 18:49 +0100, Jörg Kiegeland wrote:
 Hello,
 
 I created a multicolumn index on the columns c_1,..,c_n .
 If I do use only a true subset of these columns in a SQL query, is the 
 index still efficient?
 Or is it better to create another multicolumn index defined on this subset?
 
 Thanks for any comments!

Why would you create a multicolumn index for all columns if that's not
what you actually query on?

The order of columns matters for multicolumn indexes.  Multicolumn
indexes work best for queries that use all of the columns in the index,
but can also be helpful if at least the leftmost columns in the index
are specified in the query.  So it depends on the order.

If the index is defined on (c_1, c_2, c_3, c_4) and your query includes:
WHERE c_2=val AND c_3=val AND c_4=val, then the index is almost
certainly useless.

On the other hand, if you were to query WHERE c_1=val then if c_1 is
highly selective the index would still help.

See here:
http://www.postgresql.org/docs/8.3/interactive/indexes-multicolumn.html
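
To make that concrete, a small sketch (illustrative names only):

-- hypothetical table and index, for illustration
CREATE INDEX t_c1_c2_c3_c4_idx ON t (c_1, c_2, c_3, c_4);

-- can use the index: the leftmost column is constrained
SELECT * FROM t WHERE c_1 = 10 AND c_2 = 20;

-- almost certainly cannot use it effectively: c_1 is not constrained
SELECT * FROM t WHERE c_2 = 20 AND c_3 = 30 AND c_4 = 40;

Running EXPLAIN on both forms against your real tables will show which
queries the planner can actually serve from the index.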

-- Mark Lewis



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] select on 22 GB table causes "An I/O error occured while sending to the backend." exception

2008-08-26 Thread Mark Lewis
On Tue, 2008-08-26 at 18:44 +0200, henk de wit wrote:
 Hi,
 
 We're currently having a problem with queries on a medium sized table. This 
 table is 22GB in size (via select 
 pg_size_pretty(pg_relation_size('table'));). It has 7 indexes, which bring 
 the total size of the table to 35 GB (measured with pg_total_relation_size).
 
 On this table we're inserting records with a relatively low frequency of +- 
 6~10 per second. We're using PG 8.3.1 on a machine with two dual core 2.4Ghz 
 XEON CPUs, 16 GB of memory and Debian Linux. The machine is completely 
 devoted to PG, nothing else runs on the box.
 
 Lately we're getting a lot of exceptions from the Java process that does 
 these inserts: An I/O error occured while sending to the backend. No other 
 information is provided with this exception (besides the stack trace of 
 course). The pattern is that for about a minute, almost every insert to this 
 22 GB table results in this exception. After this minute everything is 
 suddenly fine and PG happily accepts all inserts again. We tried to nail the 
 problem down, and it seems that every time this happens, a select query on 
 this same table is in progress. This select query starts right before the 
 insert problems begin and most often right after this select query finishes 
 executing, inserts are fine again. Sometimes though inserts only fail in the 
 middle of the execution of this select query. E.g. if the select query starts 
 at 12:00 and ends at 12:03, inserts fail from 12:01 to 12:02.
 
 We have spend a lot of hours in getting to the bottom of this, but our ideas 
 for fixing this problem are more or less exhausted at the moment.
 
 I wonder if anyone recognizes this problem and could give some pointers to 
 stuff that we could investigate next. 
 
 Thanks a lot in advance.

If the select returns a lot of data and you haven't enabled cursors (by
calling setFetchSize), then the entire SQL response will be loaded in
memory at once, so there could be an out-of-memory condition on the
client.

Or if the select uses sorts and PG thinks it has access to more sort
memory than is actually available on the system (due to ulimits,
physical memory restrictions, etc) then you could run into problems that
look like out-of-memory errors on the server.

It could also be something else entirely: exceeding your max
connections, something like that.
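
A quick way to rule the max-connections theory in or out is to check
from psql while the problem is happening (standard views and settings,
nothing beyond a reasonably recent 8.x server assumed):

SHOW max_connections;
SELECT count(*) AS open_backends FROM pg_stat_activity;

If open_backends is bumping up against max_connections during the
problem window, that points at connection exhaustion rather than memory.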

A really good place to start would be to enable tracing on the JDBC
driver.  Look at the docs for the PostgreSQL JDBC driver for how to
enable logging; that should give you a much better picture of exactly
where and what is failing.

If the issue is server-side, then you will also want to look at the
PostgreSQL logs on the server; anything as serious as a backend aborting
should write an entry in the log.

-- Mark

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-21 Thread Mark Lewis
On Thu, 2008-08-21 at 12:33 +0530, Kranti K K Parisa™ wrote:

 On Wed, Aug 20, 2008 at 8:54 PM, Matthew Wakeling
 [EMAIL PROTECTED] wrote:
 On Wed, 20 Aug 2008, Kranti K K Parisa™ wrote:
 creating multiple indexes on same column will effect
 performance?
  for example:
 
 index1 : column1, column2, column3
 index2: column1
 index3: column2,
 index4: column3
 index5: column1,column2
 
 
 The sole purpose of indexes is to affect performance.
 
 However, if you have index1, there is no point in having
 index2 or index5.
 
 Matthew
 
 Thanks Matthew,
 
 does that mean i can just have index1, index3, index4?
 

(trying to get the thread back into newest-comments-last order)

Well, yes you can get away with just index1, index3 and index4, and it
may well be the optimal solution for you, but it's not entirely
clear-cut.

It's true that PG can use index1 to satisfy queries of the form SELECT
x FROM y WHERE column1=somevalue or column1=a AND column2=b.  It will
not be as fast as an index lookup from a single index, but depending on
the size of the tables/indexes and the selectivity of leading column(s)
in the index, the difference in speed may be trivial.

On the other hand, if you have individual indexes on column1, column2
and column3 but no multi-column index, PG can combine the individual
indexes in memory with a bitmap.  This is not as fast as a normal lookup
in the multi-column index would be, but can still be a big win over not
having an index at all.

To make an educated decision you might want to read over some of the
online documentation about indexes, in particular these two sections:

http://www.postgresql.org/docs/8.3/interactive/indexes-multicolumn.html

and

http://www.postgresql.org/docs/8.3/interactive/indexes-bitmap-scans.html
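
As a rough sketch of the difference (table and index names are
invented; the plans shown are what you would typically expect, though
the planner may choose differently based on your statistics):

-- single-column indexes only: expect a BitmapAnd of two bitmap scans
CREATE INDEX t_column1_idx ON t (column1);
CREATE INDEX t_column2_idx ON t (column2);
EXPLAIN SELECT * FROM t WHERE column1 = 1 AND column2 = 2;

-- with a multicolumn index: expect a single index scan
CREATE INDEX t_column1_column2_idx ON t (column1, column2);
EXPLAIN SELECT * FROM t WHERE column1 = 1 AND column2 = 2;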

-- Mark

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-20 Thread Mark Lewis
On Wed, 2008-08-20 at 17:55 +0530, Kranti K K Parisa™ wrote:
 Hi,
 
 Can anyone suggest the performance tips for PostgreSQL using
 Hibernate.
 
 One of the queries:
 
 - PostgreSQL has INDEX concept and Hibernate also has Column INDEXes.
 Which is better among them? or creating either of them is enough? or
 need to create both of them?
 
 and any more performace aspects ?

Hibernate is a library for accessing a database such as PostgreSQL.  It
does not offer any add-on capabilities to the storage layer itself.  So
when you tell Hibernate that a column should be indexed, all that it
does create the associated PostgreSQL index when you ask Hibernate to
build the DB tables for you.  This is part of Hibernate's effort to
protect you from the implementation details of the underlying database,
in order to make supporting multiple databases with the same application
code easier.

So there is no performance difference between a PG index and a Hibernate
column index, because they are the same thing.

The most useful Hibernate performance-tuning advice isn't PG-specific at
all; there are just things that you need to keep in mind when developing
for any database to avoid pathologically bad performance.  Those tips
are really beyond the scope of this mailing list; Google is your friend
here.

I've been the architect for an enterprise-class application for a few
years now using PostgreSQL and Hibernate together in a
performance-critical context, and honestly I can't think of one time
that I've been bitten by a PG-specific performance issue (there have
been a lot of performance issues with Hibernate that affected all
databases, though; you need to know what you're doing to make Hibernate
apps run fast).  If you do run into problems, you can figure out the
actual SQL that Hibernate is issuing and run the normal PostgreSQL
EXPLAIN ANALYZE on it; such problems are usually caused by a missing
index.
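
A sketch of that last step (the table, alias and index names below are
invented; substitute the SQL you capture from Hibernate's logging or the
PostgreSQL statement log):

-- hypothetical query as Hibernate might emit it
EXPLAIN ANALYZE
SELECT o0_.id, o0_.customer_id, o0_.total
FROM orders o0_
WHERE o0_.customer_id = 42;

-- if the plan shows a sequential scan over a big table, the usual fix
CREATE INDEX orders_customer_id_idx ON orders (customer_id);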

-- Mark



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-20 Thread Mark Lewis
The tradeoffs for multiple indexes are more or less as follows:

1. Having the right indexes makes queries faster, often dramatically so.

2. But more indexes makes inserts/updates slower, although generally not
dramatically slower.

3. Each index requires disk space.  With several indexes, you can easily
have more of your disk taken up by indexes than with actual data.

I would be careful to only create the indexes you need, but it's
probably worse to have too few indexes than too many.  Depends on your
app though.
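
If you want to see how much space your indexes are actually taking, a
rough check (uses the size functions available in 8.1 and later):

SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS index_size
FROM pg_class
WHERE relkind = 'i'
ORDER BY pg_relation_size(oid) DESC
LIMIT 20;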

-- Mark

On Wed, 2008-08-20 at 20:40 +0530, Kranti K K Parisa™ wrote:
 creating multiple indexes on same column will effect performance?
  for example:
 
 index1 : column1, column2, column3
 index2: column1
 index3: column2,
 index4: column3
 index5: column1,column2
 
 which means, i am trying fire the SQL queries keeping columns in the
 where conditions. and the possibilities are like the above.
 
 if we create such indexes will it effect on performance?
 and what is the best go in this case?
 
 
 On Wed, Aug 20, 2008 at 8:10 PM, Mark Lewis [EMAIL PROTECTED]
 wrote:
 On Wed, 2008-08-20 at 17:55 +0530, Kranti K K Parisa™ wrote:
 
 
  Hi,
 
  Can anyone suggest the performance tips for PostgreSQL using
  Hibernate.
 
  One of the queries:
 
  - PostgreSQL has INDEX concept and Hibernate also has Column
 INDEXes.
  Which is better among them? or creating either of them is
 enough? or
  need to create both of them?
 
  and any more performace aspects ?
 
 
 
 Hibernate is a library for accessing a database such as
 PostgreSQL.  It
 does not offer any add-on capabilities to the storage layer
 itself.  So
 when you tell Hibernate that a column should be indexed, all
 that it
 does create the associated PostgreSQL index when you ask
 Hibernate to
 build the DB tables for you.  This is part of Hibernate's
 effort to
 protect you from the implementation details of the underlying
 database,
 in order to make supporting multiple databases with the same
 application
 code easier.
 
 So there is no performance difference between a PG index and a
 Hibernate
 column index, because they are the same thing.
 
 The most useful Hibernate performance-tuning advice isn't
 PG-specific at
 all, there are just things that you need to keep in mind when
 developing
 for any database to avoid pathologically bad performance;
 those tips are
 really beyond the scope of this mailing list, Google is your
 friend
 here.
 
 I've been the architect for an enterprise-class application
 for a few
 years now using PostgreSQL and Hibernate together in a
 performance-critical context, and honestly I can't think of
 one time
 that I've been bitten by a PG-specific performance issue (a
 lot of
 performance issues with Hibernate that affected all databases
 though;
 you need to know what you're doing to make Hibernate apps that
 run fast.
 If you do run into problems, you can figure out the actual SQL
 that
 Hibernate is issuing and do the normal PostgreSQL explain
 analyze on it;
 usually caused by a missing index.
 
 -- Mark
 
 
 
 
 --
 Sent via pgsql-performance mailing list
 (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance
 
 
 
 -- 
 
 Best Regards
 Kranti Kiran Kumar Parisa
 M: +91 - 9391 - 438 - 738
 +91 - 9849 - 625 - 625
 
 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL+Hibernate Performance

2008-08-20 Thread Mark Lewis
Yes, we use connection pooling.  As I recall Hibernate ships with c3p0
connection pooling built-in, which is what we use.  We were happy enough
with c3p0 that we ended up moving our other non-hibernate apps over to
it, away from DBCP.

pgpool does connection pooling at a socket level instead of in a local
library level, so really it's a very different thing.  If your app is
the only thing talking to this database, and you don't have a
multi-database configuration, then it will be easier for you to use a
Java-based connection pooling library like c3p0 or DBCP than to use
pgpool.

-- Mark

On Wed, 2008-08-20 at 20:32 +0530, Kranti K K Parisa™ wrote:
 Hi Mark,
 
 Thank you very much for the information. I will analyse the DB
 structure and create indexes on PG directly.
 Are you using any connection pooling like DBCP? or PG POOL?
 
 Regards, KP
 
 
 On Wed, Aug 20, 2008 at 8:05 PM, Mark Lewis [EMAIL PROTECTED]
 wrote:
 
 On Wed, 2008-08-20 at 17:55 +0530, Kranti K K Parisa™ wrote:
  Hi,
 
  Can anyone suggest the performance tips for PostgreSQL using
  Hibernate.
 
  One of the queries:
 
  - PostgreSQL has INDEX concept and Hibernate also has Column
 INDEXes.
  Which is better among them? or creating either of them is
 enough? or
  need to create both of them?
 
  and any more performace aspects ?
 
 
 Hibernate is a library for accessing a database such as
 PostgreSQL.  It
 does not offer any add-on capabilities to the storage layer
 itself.  So
 when you tell Hibernate that a column should be indexed, all
 that it
 does create the associated PostgreSQL index when you ask
 Hibernate to
 build the DB tables for you.  This is part of Hibernate's
 effort to
 protect you from the implementation details of the underlying
 database,
 in order to make supporting multiple databases with the same
 application
 code easier.
 
 So there is no performance difference between a PG index and a
 Hibernate
 column index, because they are the same thing.
 
 The most useful Hibernate performance-tuning advice isn't
 PG-specific at
 all, there are just things that you need to keep in mind when
 developing
 for any database to avoid pathologically bad performance;
 those tips are
 really beyond the scope of this mailing list, Google is your
 friend
 here.
 
 I've been the architect for an enterprise-class application
 for a few
 years now using PostgreSQL and Hibernate together in a
 performance-critical context, and honestly I can't think of
 one time
 that I've been bitten by a PG-specific performance issue (a
 lot of
 performance issues with Hibernate that affected all databases
 though;
 you need to know what you're doing to make Hibernate apps that
 run fast.
 If you do run into problems, you can figure out the actual SQL
 that
 Hibernate is issuing and do the normal PostgreSQL explain
 analyze on it;
 usually caused by a missing index.
 
 -- Mark
 
 
 
 -- 
 
 Best Regards
 Kranti Kiran Kumar Parisa
 M: +91 - 9391 - 438 - 738
 +91 - 9849 - 625 - 625
 
 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hardware question for a DB server

2008-03-12 Thread Mark Lewis
What type of usage does it need to scale for?  How many concurrent
connections?  What size database?  Data warehousing or OLTP-type
workloads?  Ratio of reads/writes?  Do you care about losing data?

One question that's likely going to be important depending on your
answers above is whether or not you're getting a battery-backed write
cache for that ServeRAID-8K.

-- Mark Lewis

On Wed, 2008-03-12 at 19:58 +0100, Pascal Cohen wrote:
 Hello, we plan to buy a dedicated server to host our database.
 Here is the proposal I was given (with a second identical server for 
 backup using log shipping):
 =
 IBM X3650 (This is a 2U server, can hold 8 Drives)
 2 x QC Xeon E5450 (3.0GHz 12MB L2 1333MHz 80W)
 8 x 2GB RAM (16GB total)
 2.5 SAS Hotswap
 ServeRAID-8K SAS Controller
 8 x 73GB 15K 2.5 SAS Drive
 CD/DVD Drive
 Remote Supervisor Adapter II Slimline
 Redundant Power
 4 Year, 24x7 2hour support/warranty
 
 =
 
 I would like specialists advices.
 
 If you need additional details, please let me know.
 
 Thanks in advance for your help
 
 Thank you
 
 Pascal
 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query slows after offset of 100K

2008-02-14 Thread Mark Lewis
Michael,

Our application had a similar problem, and what we did to avoid having
people click into the middle of 750k records was to show the first page
with forward/back links but no link to go to the middle.  So people
could manually page forward as far as they want, but nobody is going to
sit there clicking next 37k times.  We have several thousand users and
none of them complained about the change.  Maybe it's because at the
same time as we made that change we also improved the rest of the
searching/filtering interface.  But I think that really people don't
need to jump to the middle of the records anyway as long as you have
decent search abilities.

If you wanted to keep your same GUI, one workaround would be to
periodically update a table which maps page number to first unique
key on page.  That decouples the expensive work to generate the page
offsets from the query itself, so if your data changes fairly
infrequently it might be appropriate.  Sort of a materialized-view type
approach.

If you can be approximate in your GUI you can do a lot more with this
optimization-- if people don't necessarily need to be able to go
directly to page 372898 but instead would be satisfied with a page
roughly 47% of the way into the massive result set (think of a GUI
slider), then you wouldn't need to update the lookup table as often even
if the data changed frequently, because adding a few thousand records to
a 750k row result set is statistically insignificant, so your markers
wouldn't need to be updated very frequently and you wouldn't need to
store a marker for each page, maybe only 100 markers spread evenly
across the result set would be sufficient.
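
A rough sketch of that marker-table idea, reusing the column names from
the query you quoted below (the page_markers table and the batch job
that refreshes it are hypothetical):

-- rebuilt periodically by a batch job
CREATE TABLE page_markers (
    page_no     integer PRIMARY KEY,
    first_name  text NOT NULL    -- first objectname on that page
);

-- to display page 5000: look up its marker, then do a keyset query
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE o.objectname >= (SELECT first_name
                       FROM page_markers WHERE page_no = 5000)
ORDER BY o.objectname
LIMIT 20;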

-- Mark Lewis


On Thu, 2008-02-14 at 19:49 +, Michael Lorenz wrote:
 Fair enough, and I did think of this as well.  However, I didn't think this 
 was a viable option in my case, since we're currently allowing the user to 
 randomly access the pages (so $lastkey wouldn't really have any meaning).  
 The user can choose to sort on object ID, name or modification time, and then 
 go straight to any page in the list.  With 750K records, that's around 37K 
 pages.
 
 Maybe a better way to phrase my question is:  how can I paginate my data on 3 
 different keys which allow random access to any given page, and still get 
 reasonable performance?  Should I just force the user to limit their result 
 set to some given number of records before allowing any paginated access?  Or 
 is it just not practical, period?
 
 Thanks,
 Michael Lorenz
 
 
  To: [EMAIL PROTECTED]
  CC: pgsql-performance@postgresql.org
  Subject: Re: [PERFORM] Query slows after offset of 100K 
  Date: Thu, 14 Feb 2008 14:08:15 -0500
  From: [EMAIL PROTECTED]
  
  Michael Lorenz  writes:
  My query is as follows:
  SELECT o.objectid, o.objectname, o.isactive, o.modificationtime 
  FROMobject o 
  WHERE  ( o.deleted = false OR o.deleted IS NULL ) 
  AND  o.accountid = 111 
  ORDER BY 2 
  LIMIT 20 OFFSET 1;
  
  This is guaranteed to lose --- huge OFFSET values are never a good idea
  (hint: the database still has to fetch those rows it's skipping over).
  
  A saner way to do pagination is to remember the last key you displayed
   and do something like WHERE key > $lastkey ORDER BY key LIMIT 20,
  which will allow the database to go directly to the desired rows,
  as long as you have an index on the key.  You do need a unique ordering
  key for this to work, though.
  
  regards, tom lane
  
  ---(end of broadcast)---
  TIP 1: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly
 
 _
 Your Future Starts Here. Dream it? Then be it! Find it at www.seek.com.au
 http://a.ninemsn.com.au/b.aspx?URL=http%3A%2F%2Fninemsn%2Eseek%2Ecom%2Eau%2F%3Ftracking%3Dsk%3Ahet%3Ask%3Anine%3A0%3Ahot%3Atext_t=764565661_r=OCT07_endtext_Future_m=EXT
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Benchmark Data requested --- pgloader CE design ideas

2008-02-07 Thread Mark Lewis
  I was thinking of not even reading the file content from the controller
  thread, just decide splitting points in bytes (0..ST_SIZE/4 -
  ST_SIZE/4+1..2*ST_SIZE/4 etc) and let the reading thread fine-tune by
  beginning to process input after having read first newline, etc.
 
 The problem I was pointing out is that if chunk#2 moved forward a few bytes 
 before it started reading in search of a newline, how will chunk#1 know 
 that it's supposed to read up to that further point?  You have to stop #1 
 from reading further when it catches up with where #2 started.  Since the 
 start of #2 is fuzzy until some reading is done, what you're describing 
 will need #2 to send some feedback to #1 after they've both started, and 
 that sounds bad to me.  I like designs where the boundaries between 
 threads are clearly defined before any of them start and none of them ever 
 talk to the others.

I don't think that any communication is needed beyond the beginning of
the threads.  Each thread knows that it should start at byte offset X
and end at byte offset Y, but if Y happens to be in the middle of a
record then just keep going until the end of the record.  As long as the
algorithm for reading past the end marker is the same as the algorithm
for skipping past the beginning marker then all is well.

-- Mark Lewis

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Autovacuum running out of memory

2007-10-16 Thread Mark Lewis
On Tue, 2007-10-16 at 10:14 -0400, Jason Lustig wrote:
 I ran ulimit -a for the postgres user, and here's what I got:
...
 max memory size (kbytes, -m) 20
 open files  (-n) 100
 max user processes  (-u) 100
 virtual memory  (kbytes, -v) 20
...

These settings are all quite low for a dedicated database server, they
would be more appropriate for a small development instance of PG sharing
a machine with several other processes.

Others have commented on the memory settings, but depending on the
maximum number of connections you expect to have open at any time you
may want to consider increasing the max user processes and open files
settings as well.

-- Mark Lewis

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] quickly getting the top N rows

2007-10-04 Thread Mark Lewis
On Thu, 2007-10-04 at 11:00 -0700, Ben wrote:
 If I have this:
 
 create table foo (bar int primary key);
 
 ...then in my ideal world, Postgres would be able to use that index on bar 
 to help me with this:
 
 select bar from foo order by bar desc limit 20;
 
 But in my experience, PG8.2 is doing a full table scan on foo, then 
 sorting it, then doing the limit. I have a more complex primary key, but I 
 was hoping the same concept would still apply. Am I doing something wrong, 
 or just expecting something that doesn't exist?

It has to do with the way that NULL values are stored in the index.
This page has details and instructions for how to get it to work:

http://developer.postgresql.org/pgdocs/postgres/indexes-ordering.html
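
For what it's worth, a sketch of the approach that page describes (the
index-ordering syntax is from the 8.3-era docs linked above, so it will
not work on 8.2 as-is; names are illustrative):

-- declare the index in the same order the query wants the rows
CREATE INDEX foo_bar_desc_idx ON foo (bar DESC NULLS LAST);

SELECT bar FROM foo ORDER BY bar DESC NULLS LAST LIMIT 20;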

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Low CPU Usage

2007-09-19 Thread Mark Lewis
On Wed, 2007-09-19 at 10:38 -0700, [EMAIL PROTECTED] wrote:
 Hi all.
 Recently I have installed a brand new server with a Pentium IV 3.2
 GHz, SATA Disk, 2GB of Ram in Debian 4.0r1 with PostgreSQL 8.2.4
 (previously a 8.1.9).
 I have other similar server with an IDE disk, Red Hat EL 4 and
 PostgreSQL 8.2.3
 
 I have almost the same postgresql.conf in both servers, but in the new
 one (I have more work_mem than the other one) things go really slow.
 I began to monitor i/o disk and it's really ok, I have test disk with
 hdparm and it's 5 times faster than the IDE one.
 Running the same queries on both servers, the new one takes almost 4
 minutes instead of 18 seconds on the old one.
 Both databases are the same, I have vacuumed them and I don't know how
 to manage this issue.
 The only weird thing is that the older server uses 30% of CPU running
 the query, instead of 3 or 5% on the new one!!!
 What's is happening with this server? I upgrade from 8.1.9 to 8.2.4
 trying to solve this issue but I can't find a solution.
 
 Any ideas?

It could be a planning issue.  Have you analyzed the new database to
gather up-to-date statistics?  A comparison of EXPLAIN ANALYZE results
for an example query in both systems should answer that one.

Another possibility because you're dealing with lower-end drives is that
you have a case of one drive lying about fsync where the other is not.
If possible, try running your test with fsync=off on both servers.  If
there's a marked improvement on the new server but no significant change
on the old server then you've found your culprit.

-- Mark Lewis


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Index files

2007-09-14 Thread Mark Lewis
On Sat, 2007-09-15 at 01:51 +0530, Harsh Azad wrote:
 Great, creating new tablespace for indexes worked! Now the question is
 whether existing tables/index can be moved to the new tablespace using
 an alter command or the only way possible is to drop and recreate
 them?

You can alter an existing index: 

http://www.postgresql.org/docs/8.2/static/sql-alterindex.html
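
For example (tablespace name and path are placeholders; run as a
superuser and make sure the directory is owned by the postgres user):

CREATE TABLESPACE index_space LOCATION '/mnt/fastdisk/pg_indexes';

-- move an existing index without dropping and recreating it
ALTER INDEX my_big_index SET TABLESPACE index_space;

-- existing tables can be moved the same way
ALTER TABLE my_big_table SET TABLESPACE index_space;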



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] SAN vs Internal Disks

2007-09-06 Thread Mark Lewis
On Thu, 2007-09-06 at 18:05 +0530, Harsh Azad wrote:
 Hi,
 
 We are currently running our DB on a DualCore, Dual Proc 3.Ghz Xeon,
 8GB RAM, 4x SAS 146 GB 15K RPM on RAID 5.
 
 The current data size is about 50GB, but we want to purchase the
 hardware to scale to about 1TB as we think our business will need to
 support that much soon. 
 - Currently we have 80% read and 20% write percentages. 
 - Currently with this configuration the Database is showing signs of
 over-loading.
 - Auto-vacuum, etc. run on this database; vacuum full runs nightly.
 - Currently CPU loads are about 20%, memory utilization is full (but
 this is also due to linux caching disk blocks) and IO waits are
 frequent.
 - We have a load of about 400 queries per second
 
 Now we are considering to purchase our own servers and in the process
 are facing the usual dilemmas. First I'll list out what machine we
 have decided to use: 
 2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)
 32 GB RAM
 OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
 (Data Storage mentioned below)
 
 We have already decided to split our database into 3 machines on the
 basis on disjoint sets of data. So we will be purchasing three of
 these boxes. 
 
 HELP 1: Does something look wrong with above configuration, I know
 there will be small differences b/w opetron/xeon. But do you think
 there is something against going for 2.4Ghz Quad Xeons (clovertown i
 think)?
 
 HELP 2: The main confusion is with regards to Data Storage. We have
 the option of going for:
 
 A: IBM N-3700 SAN Box, having 12x FC 300GB disks,  Partitioned into 3
 disks into RAID-4 for WAL/backup, and 9 disks on RAID-DP for data, 2
 hot spare. We are also considering similar solution from EMC -
 CX310C. 
 
 B: Go for Internal of DAS based storage. Here for each server we
 should be able to have: 2x disks on RAID-1 for logs, 6x disks on
 RAID-10 for tablespace1 and 6x disks on RAID-10 for tablespace2. Or
 maybe 12x disks on RAID-10 single table-space. 
 
 What do I think? Well..
 SAN wins on manageability, replication (say to a DR site), backup,
 etc...
 DAS wins on cost
 
 But for a moment keeping these aside, i wanted to discuss, purely on
 performance side which one is a winner? It feels like internal-disks
 will perform better, but need to understand a rough magnitude of
 difference in performance to see if it's worth losing the
 manageability features. 
 
 Also if we choose to go with DAS, what would be the best tool to do
 async replication to DR site and maybe even as a extra plus a second
 read-only DB server to distribute select loads.

Sounds like a good candidate for Slony replication for backups /
read-only slaves.

I haven't seen a SAN yet whose DR / replication facilities are on par
with a good database replication solution.  My impression is that those
facilities are mostly for file servers, mail servers, etc.  It would be
difficult for a SAN to properly replicate a database given the strict
ordering, size and consistency requirements for the data files.  Not
impossible, but in my limited experience I haven't found one that I
trust to do it reliably either, vendor boastings to the contrary
notwithstanding.  (Hint: make sure you know exactly what your vendor's
definition of the term 'snapshot' really means).

So before you invest in a SAN, make sure that you're actually going to
be able to (and want to) use all the nice management features you're
paying for.  We have some SAN's that are basically acting just as
expensive external RAID arrays because we do the database
replication/backup in software anyway.

-- Mark Lewis



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SAN vs Internal Disks

2007-09-06 Thread Mark Lewis
On Thu, 2007-09-06 at 22:28 +0530, Harsh Azad wrote:
 Thanks Mark.
 
 If I replicate a snapshot of Data and log files (basically the entire
 PG data directory) and I maintain same version of postgres on both
 servers, it should work right?
 
 I am also thinking that having SAN storage will provide me with
 facility of keeping a warm standby DB. By just shutting one server
 down and starting the other mounting the same File system I should be
 able to bring my DB up when the primary incurs a physical failure. 
 
 I'm only considering SAN storage for this feature - has anyone ever
 used SAN for replication and warm standy-by on Postgres?
 
 Regards,
 Harsh


We used to use a SAN for warm standby of a database, but with Oracle and
not PG.  It worked kinda sorta, except for occasional crashes due to
buggy drivers.

But after going through the exercise, we realized that we hadn't gained
anything over just doing master/slave replication between two servers,
except that it was more expensive, had a tendency to expose buggy
drivers, had a single point of failure in the SAN array, failover took
longer and we couldn't use the warm standby server to perform read-only
queries.  So we reverted back and just used the SAN as expensive DAS and
set up a separate box for DB replication.

So if that's the only reason you're considering a SAN, then I'd advise
you to spend the extra money on more DAS disks.

Maybe I'm jaded by past experiences, but the only real use case I can
see to justify a SAN for a database would be something like Oracle RAC,
but I'm not aware of any PG equivalent to that.

-- Mark Lewis

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] join tables vs. denormalization by trigger

2007-09-04 Thread Mark Lewis
On Tue, 2007-09-04 at 20:53 +0200, Walter Mauritz wrote:
 Hi,
 
 I wonder about differences in performance between two scenarios:
 
 Background:
 Table A, ~50,000 records
 Table B, ~3,000,000 records (~20 cols)
 Table C, ~30,000,000 records (~10 cols)
 
 a query every 3sec. with limit 10
 
 Table C depends on Table B wich depends on Table A, int8 foreign key, btree 
 index
 
 * consider it a read only scenario (load data only in night, with time for 
 vacuum analyze daily)
 * im required to show records from Table C, but also with some (~5cols) info 
 from Table B 
 * where clause always contains the foreign key to Table A
 * where clause may contain further 1-10 search parameter
 
 
 Scenario A)
 simply inner join Table B + C
 
 Scenario B)
 with use of trigger on insert/update I could push the required information 
 from table B down to table C.
 - so i would only require to select from table C.
 
 
 My question:
 1) From your experience ... how much faster (approximately) in percent do you 
 regard Scenario B faster than A ?

You're assuming that B is always going to be faster than A, which
certainly isn't a foregone conclusion.  Let's say that you average 10
bytes per column.  In scenario A, the total data size is then roughly
3,000,000 * 20 * 10 + 30,000,000 * 10 * 10 = 3.6 GiB.  In scenario B due
to your denormalization, the total data size is more like 30,000,000 *
30 * 10 = 9 GiB, or 2.5 times more raw data.

That's a lot of extra disk IO, unless your database will always fit in
memory in both scenarios.

Although you didn't provide enough data to answer with certainty, I
would go on the assumption that A is going to be faster than B.  But
even if it weren't, remember that premature optimization is the root of
all evil.  If you try A and it doesn't perform fast enough, then you can
always try B later to see if it works any better.

-- Mark Lewis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] 8.2 Autovacuum BUG ?

2007-08-31 Thread Mark Lewis
On Fri, 2007-08-31 at 12:25 -0400, Pallav Kalva wrote:
 Can you please correct me if I am wrong, I want to understand how this 
 works.
 Based on what you said, it will run autovacuum again when it passes 200M 
 transactions, as SELECTS are transactions too and are going on these 
 tables.
 But the next time when it runs autovacuum, it shouldnt freeze the tuples 
 again as they are already frozen and wont generate lot of archive logs ?
 Or is this because of it ran autovacuum for the first time on this db ? 
 just the first time it does this process ?

That is correct.  The tuples are now frozen, which means that they will
not need to be frozen ever again unless you insert/update any records.
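
If you want to keep an eye on this, 8.2 tracks a per-table freeze
horizon that you can query; a rough way to see which tables are furthest
from their last freeze (how soon they get revisited depends on your
autovacuum_freeze_max_age setting):

SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;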


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-31 Thread Mark Lewis
On Fri, 2007-08-31 at 19:39 -0400, Tom Lane wrote:
 I wrote:
  Mark Lewis [EMAIL PROTECTED] writes:
  We've been holding back from upgrading to 8.2 because this one is a
  show-stopper for us.
 
  Well, you could always make your own version with this patch reverted:
  http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
  I might end up doing that in the 8.2 branch if a better solution
  seems too large to back-patch.
 
 I thought of a suitably small hack that should cover at least the main
 problem without going so far as to revert that patch entirely.  What we
 can do is have the IS NULL estimator recognize when the clause is being
 applied at an outer join, and not believe the table statistics in that
 case.  I've applied the attached patch for this --- are you interested
 in trying it out on your queries before 8.2.5 comes out?

Wish I could, but I'm afraid that I'm not going to be in a position to
try out the patch on the application that exhibits the problem for at
least the next few weeks.

-- Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] [Solved] Postgres performance problem

2007-08-30 Thread Mark Lewis
Perhaps you had a long-running transaction open (probably a buggy or
hung application) that was preventing dead rows from being cleaned up.
Restarting PG closed the offending connection and rolled back the
transaction, which allowed vacuum to clean up all the dead rows.

If you're not running regular VACUUMs at all but are instead exclusively
running VACUUM FULL, then I don't think you would see warnings about
running out of fsm entries, which would explain why you did not notice
the bloat.  I haven't confirmed that though, so I might be wrong.
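
One thing you could check is to run a plain database-wide VACUUM VERBOSE
(not VACUUM FULL); on 8.x it prints a summary of free space map usage at
the end, which is where any sign of bloat outrunning your max_fsm_pages
setting would show up:

VACUUM VERBOSE;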

-- Mark Lewis

On Thu, 2007-08-30 at 11:50 +0200, Ruben Rubio wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 
 
 Hi ...
 
 Seems its solved. But the problem is not found.
 
 As you may know, I do a vacuum full and a reindex database each day. I
 have logs that confirm that its done and I can check that everything was
  fine.
 
 So, this morning, I stopped the website, I stopped database, started it
 again. (I was around 200  days without restarting), then I vacuum
 database and reindex it (Same command as everyday) . Restart again, and
 run again the website.
 
 Now seems its working fine. But I really does not know where is the
 problem. Seems vacuum its not working fine?  Maybe database should need
 a restart? I really don't know.
 
 Does someone had a similar problem?
 
 Thanks in advance,
 Ruben Rubio
 
 
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.6 (GNU/Linux)
 Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
 
 iD8DBQFG1pLMIo1XmbAXRboRAqgQAKCkWcZYE8RDppEVI485wDLnIW2SfQCfV+Hj
 e8PurQb2TOSYDPW545AJ83c=
 =dQgM
 -END PGP SIGNATURE-
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] LIKE query verses =

2007-08-29 Thread Mark Lewis
On Wed, 2007-08-29 at 18:01 +0530, Karthikeyan Mahadevan wrote:
 
 *
  
 1)   
 
 EXPLAIN ANALYSE SELECT
 job_category.job_id,job.name,job.state,job.build_id,cat.name as
 reporting_group 
 FROM category,job_category,job,category as cat 
 WHERE job.job_id=job_category.job_id 
 AND job_category.category_id=category.category_id 
 AND cat.build_id=category.build_id 
 AND category.name = 'build_id.pap3260-20070828_01' 
 AND cat.name like ('reporting_group.Tier2%'); 
 
 QUERY PLAN 
 --
  
  Nested Loop  (cost=0.00..291.53 rows=8 width=103) (actual
 time=98.999..385.590 rows=100 loops=1) 
-  Nested Loop  (cost=0.00..250.12 rows=9 width=34) (actual
 time=98.854..381.106 rows=100 loops=1) 
  -  Nested Loop  (cost=0.00..123.22 rows=1 width=34) (actual
 time=98.717..380.185 rows=1 loops=1) 
-  Index Scan using idx_cat_by_name on category cat
  (cost=0.00..5.97 rows=1 width=34) (actual time=95.834..245.276
 rows=977 loops=1) 
  Index Cond: (((name)::text >= 'reporting'::character varying)
  AND ((name)::text < 'reportinh'::character varying)) 
  Filter: ((name)::text ~~
 'reporting_group.Tier2%'::text) 
-  Index Scan using idx_cat_by_bld_id on category
  (cost=0.00..117.24 rows=1 width=8) (actual time=0.134..0.134 rows=0
 loops=977) 
  Index Cond: (outer.build_id =
 category.build_id) 
  Filter: ((name)::text =
 'build_id.pap3260-20070828_01'::text) 
  -  Index Scan using idx_jcat_by_cat_id on job_category
  (cost=0.00..126.00 rows=71 width=8) (actual time=0.126..0.569
 rows=100 loops=1) 
Index Cond: (job_category.category_id =
 outer.category_id) 
-  Index Scan using job_pkey on job  (cost=0.00..4.59 rows=1
 width=73) (actual time=0.033..0.036 rows=1 loops=100) 
  Index Cond: (job.job_id = outer.job_id) 
 
  Total runtime: 385.882 ms 
 --
  

Remember that using LIKE causes PG to interpret an underscore as 'any
character', which means that it can only scan the index for all records
that start with 'reporting', and then it needs to apply a filter to each
match.  This is going to be slower than just going directly to the
matching index entry.

What you probably want to do is tell PG that you're looking for a
literal underscore and not for any matching character by escaping the
underscore, that will allow it to do a much quicker index scan.
Something like:

cat.name like 'reporting|_group.Tier2%' ESCAPE '|'

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Mark Lewis
It looks like your view is using a left join to look for rows in one
table without matching rows in the other, i.e. a SQL construct similar
in form to the query below:

SELECT ...
FROM A LEFT JOIN B ON (...)
WHERE B.primary_key IS NULL

Unfortunately there has been a planner regression in 8.2 in some cases
with these forms of queries.  This was discussed a few weeks (months?)
ago on this forum.  I haven't looked closely enough to confirm that this
is the problem in your case, but it seems likely.  Is it possible to
refactor the query to avoid using this construct to see if that helps?
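
For example, the generic shape above can often be rewritten as a NOT
EXISTS (a sketch only, with placeholder join columns; whether it
actually dodges the regression depends on the plan it produces):

SELECT a.*
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id);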

We've been holding back from upgrading to 8.2 because this one is a
show-stopper for us.

-- Mark Lewis

On Tue, 2007-08-28 at 11:24 -0500, Evan Carroll wrote:
 -- Forwarded message --
 From: Evan Carroll [EMAIL PROTECTED]
 Date: Aug 28, 2007 11:23 AM
 Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
 To: Scott Marlowe [EMAIL PROTECTED]
 
 
 On 8/28/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  I looked through your query plan, and this is what stood out in the 8.2 
  plan:
 
   -  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1
  width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
 Join Filter: ((public.contact.pkid =
  public.contact.pkid) AND (public.event.ts_in  public.event.ts_in))
 Filter: (public.event.pkid IS NULL)
 
  Notice the misestimation is by a factor of 62, and the actual time
  goes from 2149 to 236018 ms.
 
  Again, have you analyzed your tables  / databases?
 
 contacts=# \o scott_marlowe_test
 contacts=# VACUUM FULL ANALYZE;
 contacts=# SELECT * FROM test_view WHERE U_ID = 8;
 Cancel request sent
 ERROR:  canceling statement due to user request
 contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;
 
 output found at http://rafb.net/p/EQouMI82.html
 
 --
 Evan Carroll
 System Lord of the Internets
 [EMAIL PROTECTED]
 832-445-8877
 
 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Indexscan is only used if we use limit n

2007-08-15 Thread Mark Lewis
On Wed, 2007-08-15 at 16:36 -0300, Sebastián Baioni wrote:
 Hello,
 Whe are running PostgreSQL 8.2.0 on amd64-portbld-freebsd6.2, compiled
 by GCC cc (GCC) 3.4.6 [FreeBSD] 20060305.
 The query only uses the index if we have a limit n:
 
 Without Limit n
 explain
 select esapcuit, esapcuil
 from esact00 t1
 order by esapcuit, esapcuil
 
 Sort  (cost=843833.82..853396.76 rows=3825177 width=30)
   Sort Key: esapcuit, esapcuil
   -  Seq Scan on esact00 t1  (cost=0.00..111813.77 rows=3825177
 width=30)
 
 With Limit n
 explain
 select esapcuit, esapcuil
 from esact00 t1
 order by esapcuit, esapcuil
 limit 1

This isn't really unexpected-- it's faster to do a full sequential scan
of a table than it is to do a full index traversal over the table.  And
usually it's still cheaper even after sorting the results of the full
table scan.

So as near as we can tell, PG is just doing what it's supposed to do and
picking the best plan it can.
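
If you want to see what the index-based plan would look like and how it
actually compares, you can temporarily discourage sequential scans in a
session, using the table from your example (diagnostic use only; note
that EXPLAIN ANALYZE really runs the query):

SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT esapcuit, esapcuil FROM esact00 ORDER BY esapcuit, esapcuil;
RESET enable_seqscan;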

You didn't really ask a question-- is this causing problems somehow, or
were you just confused by the behavior?

-- Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Performance problems with large telemetric datasets on 7.4.2

2007-08-03 Thread Mark Lewis
On Fri, 2007-08-03 at 06:52 -0700, Sven Clement wrote:
 Hello everybody,
 
 as I'm new to this list I hope that it is the right place to post this
 and also the right format, so if I'm committing an error, I apologize
 in advance.
 
 First the background of my request: 
 
 I'm currently employed by an enterprise which has approx. 250 systems
 distributed worldwide which are sending telemetric data to the main
 PostgreSQL.
 The remote systems are generating about 10 events per second per
 system which accumulates to about 2500/tps. 
 The data is stored for about a month before it is exported and finally
 deleted from the database.
 On the PostgreSQL server two databases are running, one with little
 traffic (about 750K per day) and the telemetric database with heavy
 write operations all around the day (over 20 million per day). 
 We already found that the VACUUM process takes excessively long and as
 a consequence the database is being vacuumed permanently.
 
 The hardware is a IBM X306m Server, 3.2 GHz HT (Pentium IV), 1 GB RAM
 and 2x 250 GB HDD (SATA-II) with ext3 fs, one of the HDD is dedicated
 to database. 
 OS is Debian 3.1 Sarge with PostgreSQL 7.4.7 (7.4.7-6sarge1) with the
 libpq frontend library.
 
 Now the problem:
 
 The problem we are experiencing is that our queries are slowing down
 continuously even if we are performing queries on the index which is
 the timestamp of the event, a simple SELECT query with only a simple
 WHERE clause (< or >) takes very long to complete. So the database
 becomes unusable for production use as the data has to be retrieved
 very quickly if we want to act based on the telemetric data. 
 
 So I'm asking me if it is useful to update to the actual 8.2 version
 and if we could experience performance improvement only by updating.
 
 Thank you for your answers,
 Sven Clement

Upgrading from 7.4.x to 8.2.x will probably give you a performance
benefit, yes.  There have been numerous changes since the days of 7.4.

But you didn't really give any information about why the query is
running slow.  Specifically, could you provide the query itself, some
information about the tables/indexes/foreign keys involved, and an
EXPLAIN ANALYZE for one of the problematic queries?

Also, what kind of vacuuming regimen are you using?  Just a daily cron
maybe?  Are you regularly analyzing the tables?

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres optimizer

2007-08-03 Thread Mark Lewis
On Fri, 2007-08-03 at 13:58 -0400, Mouhamadou Dia wrote:
 Hello,
 
 I have a Postgres instance (version 8.1) running on a Solaris 10
 machine. When I run the following query
 
  
 
 SELECT *  FROM PROR_ORG,  ( ( ( ( (PRPT_PRT LEFT OUTER JOIN
 PRPT_PRTADR ON 
 
 PRPT_PRT.PRT_NRI = PRPT_PRTADR.PRT_NRI AND
 PRPT_PRTADR.ADR_F_DEF=true)  
 
 LEFT OUTER JOIN PLGE_CTY ON PRPT_PRTADR.CTY_NRI = PLGE_CTY.CTY_NRI)
 LEFT 
 
 OUTER JOIN PLGE_CTY1 PLGE_CTY_PLGE_CTY1 ON PLGE_CTY.CTY_NRI = 
 
 PLGE_CTY_PLGE_CTY1.CTY_NRI AND PLGE_CTY_PLGE_CTY1.LNG_CD = 'fr')
 LEFT 
 
 OUTER JOIN PLGE_CTRSD ON PRPT_PRTADR.CTRSD_CD = PLGE_CTRSD.CTRSD_CD 
 
 AND PRPT_PRTADR.CTR_ISO_CD = PLGE_CTRSD.CTR_ISO_CD)  LEFT OUTER JOIN 
 
 PLGE_CTR ON PRPT_PRTADR.CTR_ISO_CD = PLGE_CTR.CTR_ISO_CD) , PROR_ORG1 
 
 PROR_ORG_PROR_ORG1, PROR_ORGT, PROR_ORGT1 PROR_ORGT_PROR_ORGT1 
 
 WHERE (  (PROR_ORG.ORGT_CD = PROR_ORGT.ORGT_CD) AND 
 
 (PROR_ORGT.ORGT_CD = PROR_ORGT_PROR_ORGT1.ORGT_CD AND 
 
 PROR_ORGT_PROR_ORGT1.LNG_CD = 'fr') AND (PROR_ORG.PRT_NRI = 
 
 PROR_ORG_PROR_ORG1.PRT_NRI AND PROR_ORG_PROR_ORG1.LNG_CD = 'fr') AND 
 
 (PROR_ORG.PRT_NRI = PRPT_PRT.PRT_NRI) )  AND ( ((PROR_ORG.ORGT_CD
 ='CHAIN')) )
 
  
 
 it takes 45 seconds to run. In this case the optimizer does a
 sequential scan of the PRPT_PRT table (which is the largest one)
 despite the existence of an index on PRT_NRI column of PRPT_PRT table.
 
  I’ve activated the GEQO but it still takes nearly the same time to
 run (between 40 and 45s).
 
 When I change the order of PRPT_PRT and PROR_ORG tables, it takes only
 30 milliseconds to run. In this case the optimizer uses the index on
 PRT_NRI column of PRPT_PRT table, what is normal and what I was
 expecting.
 
 Is there a known problem with the Postgres optimizer?
 
 For your information, the same query takes 20 milliseconds to run on
 Informix and 60 milliseconds to run on Oracle independently of the
 order of the tables in the query.
 
  
 
 PRPT_PRT has 1.3 millions rows
 
 PRPT_PRTADR has 300.000 rows
 
 PROR_ORG has 1500 rows
 
 These are the largest tables, all the others are small tables. All
 statistics are up to date.

If I recall correctly, PG 8.2 was the first version where the planner
supported reordering outer joins.  Prior releases would get poor
performance unless the joins were listed in the right order.

So it is quite possible that upgrading to 8.2 would solve your problem.
Do you have the ability to try that?

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] disk filling up

2007-07-26 Thread Mark Lewis
On Thu, 2007-07-26 at 09:18 -0700, Brandon Shalton wrote:
 Hello all,
 
 My hard disk is filling up in the /base  directory  to where it has consumed 
 all 200gig of that drive.
 
 All the posts that i see keep saying move to a bigger drive, but at some 
 point a bigger drive would just get consumed.
 
 How can i keep the disk from filling up other than get like a half TB setup 
 just to hold the ./base/*  folder


Ummm, don't put more than 200G worth of data in there?  :)

You didn't give us any information about what you're using the database
for, why you think that using 200G is excessive, what version of the
database you're running, stuff like that.  So really there's nothing
that we can help you out with, except for the normal "read the manual
about vacuuming and make sure you're doing it" newbie answer.

-- Mark

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] insert vs select into performance

2007-07-17 Thread Mark Lewis
If you're performing the inserts via JDBC, are you using addBatch/executeBatch, or
are you directly executing each insert?  If you directly execute each
insert, then your code will wait for a server round-trip between each
insert.

That still won't get you to the speed of select into, but it should
help.  You could also look at the pgsql-jdbc archives for the JDBC
driver patches which allow you to use COPY-style bulk loading, which
should get you to the performance level of COPY, which should be
reasonably close to the performance of select into.
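
On the server side, the rough equivalents look like this (a sketch only;
it assumes the target table already exists, and COPY from a server-side
file needs superuser rights):

-- load from another table in a single statement
INSERT INTO ciu_data_type_copy SELECT * FROM ciu_data_type;

-- or bulk-load from a file with COPY
COPY ciu_data_type FROM '/tmp/ciu_data.csv' WITH CSV;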

-- Mark Lewis

On Tue, 2007-07-17 at 22:50 +0200, Thomas Finneid wrote:
 
 Michael Glaesemann wrote:
  
  On Jul 17, 2007, at 14:38 , Thomas Finneid wrote:
  
  I was doing some testing on insert compared to select into. I 
  inserted 100 000 rows (with 8 column values) into a table, which took 
  14 seconds, compared to a select into, which took 0.8 seconds.
  (fyi, the inserts where batched, autocommit was turned off and it all 
  happend on the local machine)
 
  Now I am wondering why the select into is that much faster?
  
  It would be helpful if you included the actual queries you're using, as 
  there are a number of variables:
 
 create table ciu_data_type
 (
   id  integer,
   loc_id  integer,
   value1  integer,
   value2  real,
   value3  integer,
   value4  real,
   value5  real,
   value6  char(2),
   value7  char(3),
   value8  bigint,
   value9  bigint,
   value10 real,
   value11 bigint,
   value12 smallint,
   value13 double precision,
   value14 real,
   value15 real,
   value16 char(1),
   value17 varchar(18),
   value18 bigint,
   value19 char(4)
 );
 
 performed with JDBC
 
 insert into ciu_data_type (id, loc_id, value3, value5, value8, value9, 
 value10, value11 ) values (?,?,?,?,?,?,?,?)
 
 select * into ciu_data_type_copy from ciu_data_type
 
  1) If there are any constraints on the original table, the INSERT will 
  be checking those constraints. AIUI, SELECT INTO does not generate any 
  table constraints.
 
 No constraints in this test.
 
  2a) Are you using INSERT INTO foo (foo1, foo2, foo2) SELECT foo1, foo2, 
  foo3 FROM pre_foo or individual inserts for each row? The former would 
  be faster than the latter.
  
  2b) If you are doing individual inserts, are you wrapping them in a 
  transaction? The latter would be faster.
 
 disabling autocommit, but nothing more than that
 
 
 I havent done this test in a stored function yet, nor have I tried it 
 with a C client so far, so there is the chance that it is java/jdbc that 
 makes the insert so slow. I'll get to that test soon if there is any 
 chance my theory makes sence.
 
 regards
 
 thomas
 
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Delete Cascade FK speed issue

2007-07-03 Thread Mark Lewis
On Tue, 2007-07-03 at 08:05 +0200, Patric de Waha wrote:
 Hi,
 I've dbase with about 80 relations.
 On deleting a user, this cascades through all the tables.
 This is very slow, for 20 users it takes 4 hours, with exclusive 
 access to the dbase.
 No other users connected to the dbase.
 
 Ok I know there will be somewhere a relation with a FK without 
 index, which
 is being scanned sequentially. But how can I find out what postgres 
 is doing
 while it is handling the transaction?
 
 Is there a way I can find out what postgres does, and where it hangs 
 around, so I know
  where the FK might not be indexed. (The dbase is too big to analyze 
  it by hand).
 
 The  way I do it now is to check the pg_locks relation, but this is 
 not very representative.
 
 Is there profiling method for triggers/constraints, or a method 
 which gives me a hint
 why it is taking so long?

In 8.1 and later, an EXPLAIN ANALYZE of the delete will show you the
amount of time spent in each trigger.  Remember that it will still
perform the delete, so if you want to be able to re-run the DELETE over
and over as you add missing indexes, run it in a transaction and
rollback each time.  That will tell you which foreign key constraint
checks are taking up time.  The output will not be nearly as useful if
you don't name your foreign key constraints, but is still better than
nothing.
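
For illustration, a minimal sketch of that approach (the table, column and
constraint names here are made up):

BEGIN;
EXPLAIN ANALYZE DELETE FROM parent_table WHERE id = 42;
-- the output ends with one line per trigger, something like:
--   Trigger for constraint child_fk: time=1234.567 calls=1
ROLLBACK;  -- the row comes back, so the test can be repeated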

Alternatively, you can just dump the schema to a text file and spend 30
minutes and some text searching to reconstruct your foreign key
dependency graph rooted at the table in question and check each column
for proper indexes.  We recently did this for a 150 relation database,
it's not as painful as you seem to think it is.  An 80 relation database
is by no means too big to analyze :)

-- Mark Lewis



Re: [PERFORM] Volunteer to build a configuration tool

2007-06-20 Thread Mark Lewis
On Wed, 2007-06-20 at 11:21 -0400, Greg Smith wrote:
...
 One of the things that was surprising to me when I started looking at the 
 organization of the PostgreSQL buffer cache is how little gross 
 information about its contents is available.  I kept expecting to find a 
 summary section where you could answer questions like how much of the 
 cache currently has information about index/table X? used as an input to 
 the optimizer.  I understand that the design model expects much of this is 
 unknowable due to the interaction with the OS cache, and in earlier 
 versions you couldn't make shared_buffers big enough for its contents to 
 be all that interesting, so until recently this wasn't worth collecting.
 
 But in the current era, where it's feasible to have multi-GB caches 
 efficiently managed by PG and one can expect processor time is relatively 
 cheap, it seems to me one way to give a major boost to the optimizer is to 
 add some overhead to buffer cache management so it collects such 
 information.  When I was trying to do a complete overhaul on the 
 background writer, the #1 problem was that I had to assemble my own 
 statistics on what was inside the buffer cache as it was scanned, because 
 a direct inspection of every buffer is the only way to know things like 
 what percentage of the cache is currently dirty.
...

One problem with feeding the current state of the buffer cache to the
planner is that the planner may be trying to prepare a plan which will
execute 10,000 times.  For many interesting queries, the state of the
cache will be very different after the first execution, as indexes and
active portions of tables are brought in.

For that matter, an early stage of query execution could significantly
change the contents of the buffer cache as seen by a later stage of the
execution, even inside a single query.

I'm not saying that inspecting the buffer cache more is a bad idea, but
gathering useful information with the current planner is a bit tricky.

For purposes of idle speculation, one could envision some non-trivial
changes to PG which would make really slick use this data:

(1) Allow PG to defer deciding whether to perform an index scan or
sequential scan until the moment it is needed, and then ask the buffer
cache what % of the pages from the relevant indexes/tables are currently
cached.

(2) Automatically re-plan prepared queries with some kind of frequency
(exponential in # of executions?  fixed-time?), to allow the plans to
adjust to changes in the buffer-cache.

Besides being hard to build, the problem with these approaches (or any
other approach which takes current temporary state into account) is that
as much as some of us might want to make use of every piece of data
available to make the planner into a super-brain, there are lots of
other folks who just want plan stability.  The more dynamic the system
is, the less predictable it can be, and especially in mission-critical
stuff, predictability matters more.  Tom said it really well in a
recent post:

To me, the worst catch-22 we face in this area is that we'd like the
optimizer's choices of plan to be stable and understandable, but the
real-world costs of queries depend enormously on short-term conditions
such as how much of the table has been sucked into RAM recently by
other queries.  I have no good answer to that one.



Re: [PERFORM] Performance query about large tables, lots of concurrent access

2007-06-19 Thread Mark Lewis
On Tue, 2007-06-19 at 09:37 -0400, Karl Wright wrote:
 Alvaro Herrera wrote:
  Karl Wright wrote:
  
  This particular run lasted four days before a VACUUM became essential. 
  The symptom that indicates that VACUUM is needed seems to be that the 
  CPU usage of any given postgresql query skyrockets.  Is this essentially 
  correct?
  
  Are you saying you weren't used to run VACUUM all the time?  If so,
  that's where the problem lies.
  
 
 Postgresql 7.4 VACUUM runs for so long that starting it with a cron job 
 even every 24 hours caused multiple instances of VACUUM to eventually be 
 running in my case.  So I tried to find a VACUUM schedule that permitted 
 each individual vacuum to finish before the next one started.  A vacuum 
 seemed to require 4-5 days with this particular database - or at least 
 it did for 7.4.  So I had the VACUUM schedule set to run every six days.
 
 I will be experimenting with 8.1 to see how long it takes to complete a 
 vacuum under load conditions tonight.

The longer you wait between vacuuming, the longer each vacuum is going
to take.  

There is of course a point of diminishing returns for vacuum where this
no longer holds true; if you vacuum too frequently the overhead of
running the vacuum will dominate the running time.  But 6 days for a
busy database is probably way, way, way past that threshold.

Generally, the busier the database the more frequently you need to
vacuum, not less.  If your update/delete transaction rate is high enough
then you may need to vacuum multiple times per hour, at least on some
tables.  Playing with autovacuum might help you out here, because it can
look at how badly a vacuum is needed and adjust the vacuuming rate on
the fly on a per-table basis.  Be sure to look up some reasonable
autovacuum settings first; the 8.1 defaults aren't.
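
For anyone hunting for those settings: they live in postgresql.conf, and on
8.1 the row-level stats collector must be enabled or autovacuum does nothing.
The values below are only an illustrative starting point, not a
recommendation for any particular workload:

stats_start_collector = on
stats_row_level = on
autovacuum = on                        # off by default in 8.1
autovacuum_naptime = 60                # seconds between checks
autovacuum_vacuum_scale_factor = 0.2   # the 0.4 default waits quite a while
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250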

-- Mark



Re: [PERFORM] Thousands of tables versus on table?

2007-06-04 Thread Mark Lewis
On Mon, 2007-06-04 at 13:40 -0400, Thomas Andrews wrote:
 I have several thousand clients.  Our clients do surveys, and each survey
 has two tables for the client data,
 
responders
responses
 
 Frequent inserts into both table.
 
 Right now, we are seeing significant time during inserts to these two
 tables.

Can you provide some concrete numbers here?  Perhaps an EXPLAIN ANALYZE
for the insert, sizes of tables, stuff like that?

 Some of the indices in tableA and tableB do not index on the client ID
 first.

What reason do you have to think that this matters?

 So, we are considering two possible solutions.
 
  (1) Create separate responders and responses tables for each client.
 
  (2) Make sure all indices on responders and responses start with the
   client id (excepting, possibly, the primary keys for these fields) and
   have all normal operation queries always include an id_client.
 
 Right now, for example, given a responder and a survey question, we do a
 query in responses by the id_responder and id_survey.  This gives us a
 unique record, but I'm wondering if maintaining the index on
 (id_responder,id_survey) is more costly on inserts than maintaining the
 index (id_client,id_responder,id_survey) given that we also have other
 indices on (id_client,...).
 
 Option (1) makes me very nervous.  I don't like the idea of the same sorts
 of data being stored in lots of different tables, in part for long-term
 maintenance reasons.  We don't really need cross-client reporting, however.

What version of PG is this?  What is your vacuuming strategy?  Have you
tried a REINDEX to see if that helps?

-- Mark Lewis



Re: [PERFORM] LIKE search and performance

2007-05-24 Thread Mark Lewis
On Thu, 2007-05-24 at 21:54 +0100, James Mansion wrote:
  If Sybase is still like SQL Server (or the other way around), it *may*
  end up scanning the index *IFF* the index is a clustered index. If it's
  a normal index, it will do a sequential scan on the table.
 

 Are you sure its not covered?  Have to check at work - but I'm off next 
 week so it'll have to wait.
 
  It's not a win on PostgreSQL, because of our MVCC implementation. We
  need to scan *both* index *and* data pages if we go down that route, in
  which case it's a lot faster to just scan the data pages alone.
 

 Why do you need to go to all the data pages - doesn't the index 
 structure contain all the keys so
 you prefilter and then check to see if the *matched* items are still in 
 view?  I'll be first to admit I
 know zip about Postgres, but it seems odd - doesn't the index contain 
 copies of the key values?.
 
 I suspect that I mis-spoke with 'leaf'.  I really just mean 'all index 
 pages with data', since the scan
 does not even need to be in index order, just a good way to get at the 
 data in a compact way.

PG could scan the index looking for matches first and only load the
actual rows if it found a match, but that could only be a possible win
if there were very few matches, because the difference in cost between a
full index scan and a sequential scan would need to be greater than the
cost of randomly fetching all of the matching data rows from the table
to look up the visibility information.  

So yes it would be possible, but the odds of it being faster than a
sequential scan are small enough to make it not very useful.

And since it's basically impossible to know the selectivity of this kind
of where condition, I doubt the planner would ever realistically want to
choose that plan anyway because of its poor worst-case behavior.

-- Mark



Re: [PERFORM] Ever Increasing IOWAIT

2007-05-18 Thread Mark Lewis
You're not swapping are you?  One explanation could be that PG is
configured to think it has access to a little more memory than the box
can really provide, which forces it to swap once it's been running for
long enough to fill up its shared buffers or after a certain number of
concurrent connections are opened.

-- Mark Lewis

On Fri, 2007-05-18 at 10:45 +1200, Ralph Mason wrote:
 We have a database running on a 4 processor machine.  As time goes by
 the IO gets worse and worse peeking at about 200% as the machine loads
 up.
 
  
 
 The weird thing is that if we restart postgres it’s fine for hours but
 over time it goes bad again.
 
  
 
 (CPU usage graph here
 http://www.flickr.com/photos/[EMAIL PROTECTED]/502596262/ )  You can clearly
 see where the restart happens in the IO area
 
  
 
 This is Postgres  8.1.4 64bit.
 
  
 
 Anyone have any ideas?
 
  
 
 Thanks
 
 Ralph
 
  
 
 
 
 
 



Re: [PERFORM] Throttling PostgreSQL's CPU usage

2007-05-08 Thread Mark Lewis
1. If you go the route of using nice, you might want to run the 3D
front-end at a higher priority instead of running PG at a lower
priority.  That way apache, php and the other parts all run at the same
priority as PG and just the one task that you want to run smoothly is
elevated.

2. You may not even need separate priorities if you're running on Linux
with a recent kernel and you enable the sleep() calls that you would
need anyway for solution #1 to work.  This is because Linux kernels are
getting pretty good nowadays about rewarding tasks with a lot of sleeps,
although there are some further kernel changes still under development
that look even more promising.

-- Mark

On Tue, 2007-05-08 at 16:27 -0400, Daniel Griscom wrote:
 I'm building a kiosk with a 3D front end accessing PostGIS/PostgreSQL 
 via Apache/PHP. The 3D display is supposed to show smooth motion from 
 location to location, with PostGIS giving dynamically updated 
 information on the locations. Everything runs on the same machine, 
 and it all works, but when I start a query the 3D display stutters 
 horribly. It looks like PostgreSQL grabs hold of the CPU and doesn't 
 let go until it's completed the query.
 
 I don't need the PostgreSQL query to return quickly, but I must 
 retain smooth animation while the query is being processed. In other 
 words, I need PostgreSQL to spread out its CPU usage so that it 
 doesn't monopolize the CPU for any significant time (more than 50ms 
 or so).
 
 Possible solutions:
 
 1: Set the PostgreSQL task priority lower than the 3D renderer task, 
 and to make sure that the 3D renderer sleep()s enough to let 
 PostgreSQL get its work done. The obvious objection to this obvious 
 solution is Priority inversion!, but I see that as an additional 
 challenge to be surmounted rather than an absolute prohibition. So, 
 any thoughts on setting the PostgreSQL task priority (including by 
 the much-maligned tool shown at 
 http://weblog.bignerdranch.com/?p=11)?
 
 2: Some variation of the Cost-Based Vacuum Delay. Hypothetically, 
 this would have the PostgreSQL task sleep() periodically while 
 processing the query, allowing the 3D renderer to continue working at 
 a reduced frame rate. My understanding, however, is that this only 
 works during VACUUM and ANALYZE commands, so it won't help during my 
 SELECT commands. So, any thoughts on using Cost-Based Vacuum Delay as 
 a Cost-Based Select Delay?
 
 3: ... some other solution I haven't thought of.
 
 
 Any thoughts, suggestions, ideas?
 
 
 Thanks,
 Dan
 



Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-04-27 Thread Mark Lewis
Maybe he's looking for a switch for initdb that would make it
interactive and quiz you about your expected usage-- sort of a magic
auto-configurator wizard doohicky?  I could see that sort of thing being
nice for the casual user or newbie who otherwise would have a horribly
mis-tuned database.  They could instead have only a marginally mis-tuned
database :)

On Fri, 2007-04-27 at 10:30 -0400, Michael Stone wrote:
 On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote:
 Notice that the second part of my suggestion covers this --- have 
 additional
 switches to initdb so that the user can tell it about estimates on how 
 the DB
 will be used:  estimated size of the DB, estimated percentage of 
 activity that
 will involve writing, estimated percentage of activity that will be 
 transactions,
 percentage that will use indexes, percentage of queries that will be 
 complex,
 etc. etc.
 
 If the person knows all that, why wouldn't they know to just change the 
 config parameters?
 
 Mike Stone
 


Re: [PERFORM] a question about Direct I/O and double buffering

2007-04-05 Thread Mark Lewis
Not to hijack this thread, but has anybody here tested the behavior of
PG on a file system with OS-level caching disabled via forcedirectio or
by using an inherently non-caching file system such as ocfs2?

I've been thinking about trying this setup to avoid double-caching now
that the 8.x series scales shared buffers better, but I figured I'd ask
first if anybody here had experience with similar configurations.

-- Mark

On Thu, 2007-04-05 at 13:09 -0500, Erik Jones wrote:
 On Apr 5, 2007, at 12:09 PM, Xiaoning Ding wrote:
 
  Hi,
  
  
  A page may be double buffered in PG's buffer pool and in OS's buffer
  cache.
  Other DBMS like DB2 and Oracle has provided Direct I/O option to
  eliminate
  double buffering. I noticed there were discusses on the list. But
  I can not find similar option in PG. Does PG support direct I/O now?
  
  
  The tuning guide of PG usually recommends a small shared buffer pool
  (compared
  to the size of physical memory).  I think it is to avoid swapping.
  If
  there were
  swapping, OS kernel may swap out some pages in PG's buffer pool even
  PG
  want to keep them in memory. i.e. PG would loose full control over
  buffer pool.
  A large buffer pool is not good because it may
  1. cause more pages double buffered, and thus decrease the
  efficiency of
  buffer
  cache and buffer pool.
  2. may cause swapping.
  Am I right?
  
  
  If PG's buffer pool is small compared with physical memory, can I
  say
  that the
  hit ratio of PG's buffer pool is not so meaningful because most
  misses
  can be
  satisfied by OS Kernel's buffer cache?
  
  
  Thanks!
 
 
 To the best of my knowledge, Postgres itself does not have a direct IO
 option (although it would be a good addition).  So, in order to use
 direct IO with postgres you'll need to consult your filesystem docs
 for how to set the forcedirectio mount option.  I believe it can be
 set dynamically, but if you want it to be permanent you'll to add it
 to your fstab/vfstab file.
 
 
 erik jones [EMAIL PROTECTED]
 software developer
 615-296-0838
 emma(r)
 
 
 
 
 



Re: [PERFORM] a question about Direct I/O and double buffering

2007-04-05 Thread Mark Lewis
...
[snipped for brevity]
...
 
  Not to hijack this thread, but has anybody here tested the behavior
  of
  PG on a file system with OS-level caching disabled via forcedirectio
  or
  by using an inherently non-caching file system such as ocfs2?
  
  
  I've been thinking about trying this setup to avoid double-caching
  now
  that the 8.x series scales shared buffers better, but I figured I'd
  ask
  first if anybody here had experience with similar configurations.
  
  
  -- Mark
 
 
 Rather than repeat everything that was said just last week, I'll point
 out that we just had a pretty decent discusson on this last week that
 I started, so check the archives.  In summary though, if you have a
 high io transaction load with a db where the average size of your
 working set of data doesn't fit in memory with room to spare, then
 direct io can be a huge plus, otherwise you probably won't see much of
 a difference.  I have yet to hear of anybody actually seeing any
 degradation in the db performance from it.  In addition, while it
 doesn't bother me, I'd watch the top posting as some people get pretty
 religious about (I moved your comments down).

I saw the thread, but my understanding from reading through it was that
you never fully tracked down the cause of the factor of 10 write volume
mismatch, so I pretty much wrote it off as a data point for
forcedirectio because of the unknowns.  Did you ever figure out the
cause of that?

-- Mark Lewis



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Lewis
On Tue, 2007-02-06 at 12:01 -0500, Merlin Moncure wrote:
 On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  On Tue, 2007-02-06 at 10:40, Merlin Moncure wrote:
   On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
 I have a pl/pgsql function that is inserting 200,000 records for
 testing purposes.  What is the expected time frame for this operation
 on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
 a 2ghz cpu.  So far I've been sitting here for about 2 million ms
 waiting for it to complete, and I'm not sure how many inserts postgres
 is doing per second.
   
That really depends.  Doing 200,000 inserts as individual transactions
will be fairly slow.  Since PostgreSQL generally runs in autocommit
mode, this means that if you didn't expressly begin a transaction, you
are in fact inserting each row as a transaction.  i.e. this:
  
   I think OP is doing insertion inside a pl/pgsql loop...transaction is
   implied here.
 
  Yeah, I noticed that about 10 seconds after hitting send... :)
 
 actually, I get the stupid award also because RI check to unindexed
 column is not possible :)  (this haunts deletes, not inserts).

Sure it's possible:

CREATE TABLE parent (col1 int4);
-- insert many millions of rows into parent
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- insert many millions of rows into child, very very slowly.


- Mark Lewis





Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Lewis
On Tue, 2007-02-06 at 14:06 -0500, Merlin Moncure wrote:
 On 2/6/07, Mark Lewis [EMAIL PROTECTED] wrote:
   actually, I get the stupid award also because RI check to unindexed
   column is not possible :)  (this haunts deletes, not inserts).
 
  Sure it's possible:
 
  CREATE TABLE parent (col1 int4);
  -- insert many millions of rows into parent
  CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
  -- insert many millions of rows into child, very very slowly.
 
 the database will not allow you to create a RI link out unless the
 parent table has a primary key/unique constraint, which the database
 backs with an indexand you can't even trick it afterwards by
 dropping the constraint.
 
 it's the other direction, when you cascade forwards when you can have
 a problem.  this is most common with a delete, but can also happen on
 an update of a table's primary key with child tables referencing it.
 

Hmmm, should check my SQL before hitting send I guess.  Well, at least
you no longer have to wear the stupid award, Merlin :)
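
For the record, a corrected sketch (the index name is made up): the
referenced column must be a primary key or unique, and it is the referencing
column on the child side that needs an explicit index to keep deletes against
the parent cheap:

CREATE TABLE parent (col1 int4 PRIMARY KEY);
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- PostgreSQL will not create this index automatically; without it every
-- DELETE FROM parent has to seq-scan child to check the constraint:
CREATE INDEX child_col1_idx ON child (col1);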

-- Mark Lewis



Re: [PERFORM] Slow update with simple query

2006-12-14 Thread Mark Lewis
Out of curiosity, how hard would it be to modify the output of EXPLAIN
ANALYZE when doing an insert/update to include the index update times
and/or non-FK constraint checking times and/or the table row update
times?  Or any other numbers that might be useful in circumstances like
this.  I'm wondering if it's possible to shed some light on the
remaining dark shadows of PG performance troubleshooting.

-- Mark Lewis

On Thu, 2006-12-14 at 11:19 -0500, Tom Lane wrote:
 Arnaud Lesauvage [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  It seems the time must be going into this trigger function.  What
  does it do?
 
  A lot of things ! Indeed, if it runs it will very badly hurt performances 
  (table 
  lookups, string manipulation, etc...) !
  But it should only be triggered on INSERTs, and I am doing an UPDATE !
 
 Doh, right, I obviously still need to ingest more caffeine this morning.
 
 I think the conclusion must be that there was just too much I/O to be
 done to update all the rows.  Have you done any tuning of shared_buffers
 and so forth?  I recall having seen cases where update performance went
 bad as soon as the upper levels of a large index no longer fit into
 shared_buffers ...
 
   regards, tom lane
 


Re: [PERFORM] File Systems Compared

2006-12-06 Thread Mark Lewis
 Anyone run their RAIDs with disk caches enabled, or is this akin to
 having fsync off?

Disk write caches are basically always akin to having fsync off.  The
only time a write-cache is (more or less) safe to enable is when it is
backed by a battery or in some other way made non-volatile.

So a RAID controller with a battery-backed write cache can enable its
own write cache, but can't safely enable the write-caches on the disk
drives it manages.

-- Mark Lewis



Re: [PERFORM] RES: Priority to a mission critical transaction

2006-11-29 Thread Mark Lewis
On Wed, 2006-11-29 at 08:25 -0500, Brian Hurt wrote:
...
 I have the same question.  I've done some embedded real-time 
 programming, so my innate reaction to priority inversions is that 
 they're evil.  But, especially given priority inheritance, is there any 
 situation where priority inversion provides *worse* performance than 
 running everything at the same priority?  I can easily come up with 
 situations where it devolves to that case- where all processes get 
 promoted to the same high priority.  But I can't think of one where 
 using priorities makes things worse, and I can think of plenty where it 
 makes things better.
...

It can make things worse when there are at least 3 priority levels
involved.  The canonical sequence looks as follows:

LOW:  Acquire a lock
MED:  Start a long-running batch job that hogs CPU
HIGH: Wait on lock held by LOW task

at this point, the HIGH task can't run until the LOW task releases its
lock. but the LOW task can't run to completion and release its lock
until the MED job completes.

(random musing): I wonder if PG could efficiently be made to temporarily
raise the priority of any task holding a lock that a high priority task
waits on.  I guess that would just make it so that instead of HIGH tasks
being effectively reduced to LOW, then LOW tasks could be promoted to
HIGH.

-- Mark Lewis



Re: [PERFORM] Simple join optimized badly?

2006-10-11 Thread Mark Lewis
Tom,

I'm interested in the problem of cross-column statistics from a
theoretical perspective.  It would be interesting to sit down and try to
reason out a useful solution, or at very least to understand the problem
better so I can anticipate when it might come and eat me.

From my understanding, the main problem is that if PG knows the
selectivity of n conditions C1,C2,...,Cn then it doesn't know whether
the combined selectivity will be C1*C2*...*Cn (conditions are
independent) or min(C1,C2,...,Cn) (conditions are strictly dependent),
or somewhere in the middle. Therefore, row estimates could be orders of
magnitude off.

I suppose a common example would be a table with a serial primary key
column and a timestamp value which is always inserted as
CURRENT_TIMESTAMP, so the two columns are strongly correlated.  If the
planner guesses that 1% of the rows of the table will match pk < 100,
and 1% of the rows of the table will match timestamp < X, then it would
be nice for it to know that if you specify both pk < 100 AND
timestamp < X the combined selectivity is still only 1% and not 1% *
1% = 0.01%.
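
A minimal sketch of that kind of correlated pair (names and literals are
hypothetical):

CREATE TABLE events (
    id      serial PRIMARY KEY,
    created timestamptz NOT NULL DEFAULT current_timestamp
);

-- id and created grow together, so these two conditions select (nearly)
-- the same 1% of rows rather than two independent 1% slices:
SELECT count(*) FROM events
WHERE id < 100
  AND created < '2006-01-02';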

As long as I'm sitting down and reasoning about the problem anyway, are
there any other types of cases you're aware of where some form of cross-
column statistics would be useful?  In the unlikely event that I
actually come up with a brilliant and simple solution, I'd at least like
to make sure that I'm solving the right problem :)

Thanks,
Mark Lewis



On Tue, 2006-10-10 at 22:38 -0400, Tom Lane wrote:
 Brian Herlihy [EMAIL PROTECTED] writes:
  What would it take for hints to be added to postgres?
 
 A *whole lot* more thought and effort than has been expended on the
 subject to date.
 
 Personally I have no use for the idea of force the planner to do
 exactly X given a query of exactly Y.  You don't have exactly Y
 today, tomorrow, and the day after (if you do, you don't need a
 hint mechanism at all, you need a mysql-style query cache).
 IMHO most of the planner mistakes we see that could be fixed via
 hinting are really statistical estimation errors, and so the right
 level to be fixing them at is hints about how to estimate the number
 of rows produced for given conditions.  Mind you that's still a plenty
 hard problem, but you could at least hope that a hint of that form
 would be useful for more than one query.
 
   regards, tom lane
 


Re: [PERFORM] Unsubscribe

2006-10-04 Thread Mark Lewis
I'd prefer to have a short footer link called something like Mailing
List Page which would take you to a page where you could subscribe,
unsubscribe, or view the archives.  I think that making the link short
and also making it a quick shortcut away from the archives tips the
scales in terms of utility vs. annoyance.  One of the tips that shows up
in the footers today is just a link to the archives anyway.

-- Mark Lewis

On Wed, 2006-10-04 at 11:28 -0500, Bruno Wolff III wrote:
 On Wed, Oct 04, 2006 at 08:30:03 -0700,
   Joshua D. Drake [EMAIL PROTECTED] wrote:
  
  Although I 100% agree with you Bruno, it should be noted that our lists
  are a closed box for most people. They don't follow what is largely
  considered standard amongst lists which is to have list information at
  the bottom of each e-mail.
 
 There are reasons you don't want to do that. Footers work OK for single
 part email messages. They don't make so much sense in multipart messages.
 You can probably take a crap shoot and add the footer to the first
 text/plain part and not break things. This won't work so well for multipart
 alternative messages that have text/plain and text/html parts. You could
 also try to insert a footer in to the html part, but thats a bit trickier
 since you can't just put it at the end.
 
 However, since the postgres lists are mostly just using text/plain parts
 for message bodies and there are already footers being used to distribute
 tips, it wouldn't make things significantly worse to add unsubscribe
 information as well.
 
 I would prefer just making the unsubscribe instructions easy to find on
 the web.
 


Re: [PERFORM] BUG #2658: Query not using index

2006-10-03 Thread Mark Lewis
Have you looked into a materialized view sort of approach?  You could
create a table which had assetid as a primary key, and max_ts as a
column.  Then use triggers to keep that table up to date as rows are
added/updated/removed from the main table.

This approach would only make sense if there were far fewer distinct
assetid values than rows in the main table, and would get slow if you
commonly delete rows from the main table or decrease the value for ts in
the row with the highest ts for a given assetid.
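
A rough sketch of the idea, handling only the INSERT path and ignoring
concurrency corner cases (two sessions inserting the first row for the same
assetid at once); UPDATE/DELETE on the main table would need similar
triggers.  The summary table name and column types are guesses, and the
dollar quoting assumes 8.0 or later:

CREATE TABLE latest_position (
    assetid int4 PRIMARY KEY,
    max_ts  timestamptz NOT NULL
);

CREATE FUNCTION update_latest_position() RETURNS trigger AS $$
BEGIN
    -- bump the summary row if the incoming position is newer
    UPDATE latest_position
       SET max_ts = NEW.ts
     WHERE assetid = NEW.assetid
       AND max_ts < NEW.ts;
    IF NOT FOUND THEN
        -- either a brand new assetid, or an out-of-order (older) position
        INSERT INTO latest_position (assetid, max_ts)
        SELECT NEW.assetid, NEW.ts
        WHERE NOT EXISTS (SELECT 1 FROM latest_position
                           WHERE assetid = NEW.assetid);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER asset_positions_latest
AFTER INSERT ON asset_positions
FOR EACH ROW EXECUTE PROCEDURE update_latest_position();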

-- Mark Lewis

On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
 Thanks Tom, that explains it and makes sense.  I guess I will have to 
 accept this query taking 40 seconds, unless I can figure out another way 
 to write it so it can use indexes.  If there are any more syntax 
 suggestions, please pass them on.  Thanks for the help everyone.
 
 Graham.
 
 
 Tom Lane wrote:
 
 Graham Davis [EMAIL PROTECTED] writes:
   
 
 How come an aggreate like that has to use a sequential scan?  I know 
 that PostgreSQL use to have to do a sequential scan for all aggregates, 
 but there was support added to version 8 so that aggregates would take 
 advantage of indexes.
 
 
 
 Not in a GROUP BY context, only for the simple case.  Per the comment in
 planagg.c:
 
   * We don't handle GROUP BY, because our current implementations of
   * grouping require looking at all the rows anyway, and so there's not
   * much point in optimizing MIN/MAX.
 
 The problem is that using an index to obtain the maximum value of ts for
 a given value of assetid is not the same thing as finding out what all
 the distinct values of assetid are.
 
 This could possibly be improved but it would take a considerable amount
 more work.  It's definitely not in the category of bug fix.
 
  regards, tom lane
   
 
 
 



Re: [PERFORM] BUG #2658: Query not using index

2006-10-03 Thread Mark Lewis
Hmmm.  How many distinct assetids are there?
-- Mark Lewis

On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
 The summary table approach maintained by triggers is something we are 
 considering, but it becomes a bit more complicated to implement.  
 Currently we have groups of new positions coming in every few seconds or 
 less.  They are not guaranteed to be in order.  So for instance, a group 
 of positions from today could come in and be inserted, then a group of 
 positions that got lost from yesterday could come in and be inserted 
 afterwards. 
 
 This means the triggers would have to do some sort of logic to figure 
 out if the newly inserted position is actually the most recent by 
 timestamp.  If positions are ever deleted or updated, the same sort of 
 query that is currently running slow will need to be executed in order 
 to get the new most recent position.  So there is the possibility that 
 new positions can be inserted faster than the triggers can calculate 
 and  maintain the summary table.  There are some other complications 
 with maintaining such a summary table in our system too, but I won't get 
 into those.
 
 Right now I'm just trying to see if I can get the query itself running 
 faster, which would be the easiest solution for now.
 
 Graham.
 
 
 Mark Lewis wrote:
 
 Have you looked into a materialized view sort of approach?  You could
 create a table which had assetid as a primary key, and max_ts as a
 column.  Then use triggers to keep that table up to date as rows are
 added/updated/removed from the main table.
 
 This approach would only make sense if there were far fewer distinct
 assetid values than rows in the main table, and would get slow if you
 commonly delete rows from the main table or decrease the value for ts in
 the row with the highest ts for a given assetid.
 
 -- Mark Lewis
 
 On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
   
 
 Thanks Tom, that explains it and makes sense.  I guess I will have to 
 accept this query taking 40 seconds, unless I can figure out another way 
 to write it so it can use indexes.  If there are any more syntax 
 suggestions, please pass them on.  Thanks for the help everyone.
 
 Graham.
 
 
 Tom Lane wrote:
 
 
 
 Graham Davis [EMAIL PROTECTED] writes:
  
 
   
 
 How come an aggreate like that has to use a sequential scan?  I know 
 that PostgreSQL use to have to do a sequential scan for all aggregates, 
 but there was support added to version 8 so that aggregates would take 
 advantage of indexes.

 
 
 
 Not in a GROUP BY context, only for the simple case.  Per the comment in
 planagg.c:
 
 * We don't handle GROUP BY, because our current implementations of
 * grouping require looking at all the rows anyway, and so there's not
 * much point in optimizing MIN/MAX.
 
 The problem is that using an index to obtain the maximum value of ts for
 a given value of assetid is not the same thing as finding out what all
 the distinct values of assetid are.
 
 This could possibly be improved but it would take a considerable amount
 more work.  It's definitely not in the category of bug fix.
 
regards, tom lane
  
 
   
 
 
 
 
 



Re: [PERFORM] BUG #2658: Query not using index

2006-10-03 Thread Mark Lewis
A few hundred is quite a lot for the next proposal and it's kind of an
ugly one, but might as well throw the idea out since you never know.

Have you considered creating one partial index per assetid?  Something
along the lines of CREATE INDEX asset_index_N ON asset_positions(ts)
WHERE assetid=N?  I'd guess that the planner probably wouldn't be smart
enough to use the partial indexes unless you issued a separate query for
each assetid, but each one of those queries should be really fast.  Of
course, this is all assuming that PG knows how to use partial indexes to
satisfy MAX queries; I'm not sure if it does.
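
For concreteness, one such pair with a made-up assetid value; the WHERE
clause in the query has to match the index predicate literally, and whether
the MIN/MAX rewrite actually picks the partial index up should be verified
with EXPLAIN:

CREATE INDEX asset_positions_ts_17 ON asset_positions (ts)
    WHERE assetid = 17;

SELECT max(ts) FROM asset_positions WHERE assetid = 17;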

-- Mark Lewis

On Tue, 2006-10-03 at 14:35 -0700, Graham Davis wrote:
 Not many.  It fluctuates, but there are usually only ever a few hundred 
 at most.  Each assetid has multi-millions of positions though.
 
 Mark Lewis wrote:
 
 Hmmm.  How many distinct assetids are there?
 -- Mark Lewis
 
 On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
   
 
 The summary table approach maintained by triggers is something we are 
 considering, but it becomes a bit more complicated to implement.  
 Currently we have groups of new positions coming in every few seconds or 
 less.  They are not guaranteed to be in order.  So for instance, a group 
 of positions from today could come in and be inserted, then a group of 
 positions that got lost from yesterday could come in and be inserted 
 afterwards. 
 
 This means the triggers would have to do some sort of logic to figure 
 out if the newly inserted position is actually the most recent by 
 timestamp.  If positions are ever deleted or updated, the same sort of 
 query that is currently running slow will need to be executed in order 
 to get the new most recent position.  So there is the possibility that 
 new positions can be inserted faster than the triggers can calculate 
 and  maintain the summary table.  There are some other complications 
 with maintaining such a summary table in our system too, but I won't get 
 into those.
 
 Right now I'm just trying to see if I can get the query itself running 
 faster, which would be the easiest solution for now.
 
 Graham.
 
 
 Mark Lewis wrote:
 
 
 
 Have you looked into a materialized view sort of approach?  You could
 create a table which had assetid as a primary key, and max_ts as a
 column.  Then use triggers to keep that table up to date as rows are
 added/updated/removed from the main table.
 
 This approach would only make sense if there were far fewer distinct
 assetid values than rows in the main table, and would get slow if you
 commonly delete rows from the main table or decrease the value for ts in
 the row with the highest ts for a given assetid.
 
 -- Mark Lewis
 
 On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
  
 
   
 
 Thanks Tom, that explains it and makes sense.  I guess I will have to 
 accept this query taking 40 seconds, unless I can figure out another way 
 to write it so it can use indexes.  If there are any more syntax 
 suggestions, please pass them on.  Thanks for the help everyone.
 
 Graham.
 
 
 Tom Lane wrote:
 

 
 
 
 Graham Davis [EMAIL PROTECTED] writes:
 
 
  
 
   
 
 How come an aggreate like that has to use a sequential scan?  I know 
 that PostgreSQL use to have to do a sequential scan for all aggregates, 
 but there was support added to version 8 so that aggregates would take 
 advantage of indexes.
   
 

 
 
 
 Not in a GROUP BY context, only for the simple case.  Per the comment in
 planagg.c:
 
   * We don't handle GROUP BY, because our current implementations of
   * grouping require looking at all the rows anyway, and so there's not
   * much point in optimizing MIN/MAX.
 
 The problem is that using an index to obtain the maximum value of ts for
 a given value of assetid is not the same thing as finding out what all
 the distinct values of assetid are.
 
 This could possibly be improved but it would take a considerable amount
 more work.  It's definitely not in the category of bug fix.
 
  regards, tom lane
 
 
  
 
   
 

 
 
 
 
 
 
 



Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Mark Lewis
I'm not sure if decreasing BLKSZ is the way to go.  It would allow you
to have more smaller blocks in memory, but the actual coverage of the
index would remain the same; if only 33% of the index fits in memory
with the 8K BLKSZ then only 33% would fit in memory with a 4k BLKSZ.  I
can see where you're going if the tree nodes for all 15 million key
entries fit in memory as well as the most recent nodes for the logtime
nodes lower down in the index; basically trying to make sure that the
right 33% of the index is in memory. 

But it seems like it might be more useful to have two indexes, one on
logtime and one on key.  Inserts into the logtime index would be
correlated with your insert order and as such be cache-friendly so
that's not an issue.  The index on just the key column would be at least
as small as the active subset of a combined index, so performance should
be at least as good as you could possibly achieve by reducing BLKSIZE.

PG 8.1 is smart enough to use a bitmap index scan to combine the two
indexes at query time; if that gives you adequate performance then it
would be simpler than reducing BLKSIZE.
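
Concretely, something along these lines (the table name is guessed from the
discussion):

CREATE INDEX big_log_key_idx     ON big_log (key);
CREATE INDEX big_log_logtime_idx ON big_log (logtime);

-- 8.1 can AND the two indexes together with a BitmapAnd at query time:
EXPLAIN ANALYZE
SELECT * FROM big_log
 WHERE key = 12345
   AND logtime BETWEEN '2006-09-01' AND '2006-09-25';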

-- Mark Lewis

On Mon, 2006-09-25 at 17:54 -0400, Marc Morin wrote:
  Would it be possible to change the primary key to 
  (logtime,key)? This could help keeping the working window small.
 
 No, the application accessing the data wants all the rows between start
 and end time for a particular key value.  
 
  
  Secondly, the real working set is smaller, as the rows are 
  all inserted at the end of the table, filling each page until 
  it's full, so only the last pages are accessed. There's no 
  relation between the index order, and the order of data on 
  disk, unless you CLUSTER.
 
 I'd theorizing that my problem is in updating the index itself and not
 the heap.  Insert order
 Refers to the order by which the applications are inserting the rows and
 as such, the order by
 Which the index is being updated.  This in turn, is causing the b-tree
 to be traverse.  Problem
 Is the working set of blocks at the bottom of the btree is too big for
 my cache.
 
  
   Any comment on other affects or gotchas with lowering the size of 
   BLKSZ?  Currently, our database is thrashing its cache of blocks we 
   we're getting only ~100 inserts/second, every insert results in a 
   evict-read-modify operation.
  
  I'm not shure that's the correct diagnosis.
  
  Do you have one transaction per insert? Every transaction 
  means a forced sync to the disk, so you won't get more than 
  about 100-200 commits per second, depending on your actual 
  disk rotation speed.
 
 No, an insert consists of roughly 10,000+ rows per transaction block.  
 
  
  To improve concurrency of the numer of inserters running in 
  parallel, try to tweak the config variables commit_delay and 
  commit_sibling, so you get a higher overall throughput at 
  cost of an increased delay per connection, and increase the 
  number of inserters. Using sensible tweaking, the throughput 
  should scale nearly linear with the number of backens. :-)
 
 I don't think this will help us here due to large transactions already.
 
  
  If feasible for your application, you can also bundle several 
  log entries into a single transaction. If you're CPU bound, 
  you can use COPY instead of INSERT or (if you can wait for 
  8.2) the new multi-row INSERT to further improve performance, 
  but I doubt that you're CPU bound.
 
  
  The only way to really get over the sync limit is to have 
  (at least) the WAL on a battery backed ram / SSD media that 
  has no spinning disk
  physical limit, or abandon crash safety by turning fsync off.
 
 Again, problem is not with WAL writing, already on it's own raid1 disk
 pair.  The 
 I/O pattern we see is about 1-2% load on WAL and 100% load on the array
 holding the indexes and tables. Throughput is very low, something like
 150k-200K bytes/second of real  rows being deposited on the disk.
 
 The disks are busy seeking all over the disk platter to fetch a block,
 add a single row, then seek to another spot and write back a previously
 dirty buffer
 
  
  Thanks,
  Markus.
  --
  Markus Schaber | Logical TrackingTracing International AG
  Dipl. Inf. | Software Development GIS
  
  Fight against software patents in Europe! www.ffii.org 
  www.nosoftwarepatents.org
  
 


Re: [PERFORM] PostgreSQL and sql-bench

2006-09-21 Thread Mark Lewis
Not to offend, but since most of us are PG users, we're not all that
familiar with what the different tests in MySQL's sql-bench benchmark
do.  So you won't get very far by saying PG is slow on benchmark X, can
I make it faster?, because that doesn't include any of the information
we need in order to help.

Specifics would be nice, including at least the following:

1. Which specific test case(s) would you like to try to make faster?
What do the table schema look like, including indexes and constraints?

2. What strategy did you settle on for handling VACUUM and ANALYZE
during the test?  Have you confirmed that you aren't suffering from
table bloat?

3. What are the actual results you got from the PG run in question?

4. What is the size of the data set referenced in the test run?

-- Mark Lewis

On Thu, 2006-09-21 at 07:52 -0700, yoav x wrote:
 Hi
 
 After upgrading DBI and DBD::Pg, this benchmark still picks MySQL as the 
 winner (at least on Linux
 RH3 on a Dell 1875 server with 2 hyperthreaded 3.6GHz CPUs and 4GB RAM).
 I've applied the following parameters to postgres.conf:
 
 max_connections = 500
 shared_buffers = 3000
 work_mem = 10
 effective_cache_size = 30
 
 Most queries still perform slower than with MySQL. 
 Is there anything else that can be tweaked or is this a limitation of PG or 
 the benchmark?
 
 Thanks.
 
 
 


Re: [PERFORM] Large tables (was: RAID 0 not as fast as

2006-09-21 Thread Mark Lewis
 So this might be a dumb question, but the above statements apply to the
 cluster (e.g. postmaster) as a whole, not per postgres
 process/transaction correct? So each transaction is blocked waiting for
 the main postmaster to retrieve the data in the order it was requested
 (i.e. not multiple scanners/aio)?

Each connection runs its own separate back-end process, so these
statements apply per PG connection (=process).



Re: [PERFORM] Optimizing DELETE

2006-09-19 Thread Mark Lewis
You do not have indexes on all of the columns which are linked by
foreign key constraints.

For example, let's say that I had a scientist table with a single
column scientist_name and another table discovery which had
scientist_name as a column with a foreign key constraint to the
scientist table.

If the system were to try to delete a row from the scientist table, then
it would need to scan the discovery table for any row which referenced
that scientist_name.

If there is an index on the scientist_name column in the discovery
table, this is a fast operation.  In your case however, there most
likely isn't an index on that column, so it needs to do a full table
scan of the discovery table for each row deleted from the scientist
table.

If the discovery table has 100,000 rows, and there are 100 scientists,
then deleting those 100 scientists would require scanning 100,000 * 100
= 10M records, so this sort of thing can quickly become a very expensive
operation.
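
In SQL terms, a minimal sketch of that example (the index name is made up):

CREATE TABLE scientist (scientist_name text PRIMARY KEY);
CREATE TABLE discovery (
    description    text,
    scientist_name text REFERENCES scientist (scientist_name)
);

-- PostgreSQL does not create this one for you:
CREATE INDEX discovery_scientist_name_idx ON discovery (scientist_name);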

Because of this potential for truly atrocious update/delete behavior,
some database systems (SQL Server at least, and IIRC Oracle as well)
either automatically create the index on discovery.scientist_name when
the foreign key constraint is created, or refuse to create the foreign
key constraint if there isn't already an index.

PG doesn't force you to have an index, which can be desirable for
performance reasons in some situations if you know what you're doing,
but allows you to royally shoot yourself in the foot on deletes/updates
to the parent table if you're not careful.

If you have a lot of constraints and want to track down which one is
unindexed, then doing an EXPLAIN ANALYZE of deleting a single row from
the parent table will tell you how long each of the referential
integrity checks takes, so you can figure out which indexes are missing.

-- Mark Lewis

On Tue, 2006-09-19 at 15:22 +0200, Ivan Voras wrote:
 I've just fired off a DELETE FROM table command (i.e. unfiltered 
 DELETE) on a trivially small table but with many foreign key references 
 (on similar-sized tables), and I'm waiting for it to finish. It's been 
 10 minutes now, which seems very excessive for a table of 9000 rows on a 
 3 GHz desktop machine.
 
 'top' says it's all spent in USER time, and there's a ~~500KB/s write 
 rate going on. Just before this DELETE, I've deleted data from a larger 
 table (5 rows) using the same method and it finished in couple of 
 seconds - maybe it's a PostgreSQL bug?
 
 My question is: assuming it's not a bug, how to optimize DELETEs? 
 Increasing work_mem maybe?
 
 (I'm using PostgreSQL 8.1.4 on FreeBSD 6- amd64)
 
 (I know about TRUNCATE; I need those foreign key references to cascade)
 


Re: [PERFORM] sql-bench

2006-09-13 Thread Mark Lewis
The last I checked (years ago), sql-bench was very synthetic (i.e.
reflecting no realistic use case).  It's the sort of test suite that's
useful for database developers when testing the effects of a particular
code change or optimization, but not so applicable to real-world uses.  

Historically the test was also bad for PG because it did nasty things
like 10,000 inserts each in separate transactions because the test was
written for MySQL which at the time didn't support transactions.  Not
sure if that's been fixed yet or not.

Can you provide details about the schema and the queries that are slow?

-- Mark

On Wed, 2006-09-13 at 05:24 -0700, yoav x wrote:
 Hi
 
 I am trying to run sql-bench against PostgreSQL 8.1.4 on Linux.
  Some of the insert tests seem to be very slow 
 
 For example: select_join_in
 
 Are there any tuning parameters that can be changed to speed these queries? 
 Or are these queries
  especially tuned to show MySQL's strengths?
 
 
 
 


Re: [PERFORM] Is this way of testing a bad idea?

2006-08-24 Thread Mark Lewis
 Monitoring the processes using top reveals that the total amount of
 memory used slowly increases during the test. When reaching insert
 number 4, or somewhere around that, memory is exhausted, and the the
 systems begins to swap. Each of the postmaster processes seem to use a
 constant amount of memory, but the total memory usage increases all the
 same.

So . . . . what's using the memory?  It doesn't sound like PG is using
it, so is it your Java app?

If it's the Java app, then it could be that your code isn't remembering
to do things like close statements, or perhaps the max heap size is set
too large for your hardware.  With early RHEL3 kernels there was also a
quirky interaction with Sun's JVM where the system swaps itself to death
even when less than half the physical memory is in use.

If its neither PG nor Java, then perhaps you're misinterpreting the
results of top.  Remember that the free memory on a properly running
Unix box that's been running for a while should hover just a bit above
zero due to normal caching; read up on the 'free' command to see the
actual memory utilization.

-- Mark



Re: [PERFORM] PowerEdge 2950 questions

2006-08-24 Thread Mark Lewis
 it's worse than that.  if you need to read something that is not in
 the o/s cache, all the disks except for one need to be sent to a
 physical location in order to get the data.  Thats the basic rule with
 striping: it optimizes for sequential i/o in expense of random i/o.
 There are some optimizations that can help, but not much.  caching by
 the controller can increase performance on writes because it can
 optimize the movement across the disks by instituting a delay between
 the write request and the actual write.
 
 raid 1 (or 1+x) is the opposite.  It allows the drive heads to move
 independantly on reads when combined with some smart algorithms.
 writes however must involve all the disk heads however.  Many
 controllers do not to seem to optimze raid 1 properly although linux
 software raid seems to.
 
 A 4 disk raid 1, for example, could deliver four times the seek
 performance which would make it feel much faster than a 4 disk raid 0
 under certain conditions.

I understand random mid-sized seeks (seek to x and read 512k) being slow
on RAID5, but if the read size is small enough not to cross a stripe
boundary, this could be optimized to only one seek on one drive.  Do
most controllers just not do this, or is there some other reason that
I'm not thinking of that would force all disks to seek?

-- Mark



Re: [PERFORM] How to get higher tps

2006-08-22 Thread Mark Lewis
Well, at least on my test machines running gnome-terminal, my pgbench
runs tend to get throttled by gnome-terminal's lousy performance to no
more than 300 tps or so.  Running with 2/dev/null to throw away all the
detailed logging gives me 2-3x improvement in scores.  Caveat: in my
case the db is on the local machine, so who knows what all the
interactions are.

Also, when you initialized the pgbench db what scaling factor did you
use?  And does running pgbench with -v improve performance at all?

-- Mark

On Tue, 2006-08-22 at 09:19 -0400, Marty Jia wrote:
 Joshua,
 
 Here is 
 
 shared_buffers = 8
 fsync = on
 max_fsm_pages = 35
 max_connections = 1000
 work_mem = 65536
 effective_cache_size = 61
 random_page_cost = 3
  
 Here is pgbench I used:
 
 pgbench -c 10 -t 1 -d HQDB
 
 Thanks
 
 Marty
 
 -Original Message-
 From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
 Sent: Monday, August 21, 2006 6:09 PM
 To: Marty Jia
 Cc: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] How to get higher tps
 
 Marty Jia wrote:
  I'm exhausted to try all performance tuning ideas, like following 
  parameters
  
  shared_buffers
  fsync
  max_fsm_pages
  max_connections
  shared_buffers
  work_mem
  max_fsm_pages
  effective_cache_size
  random_page_cost
  
  I believe all above have right size and values, but I just can not get
 
  higher tps more than 300 testd by pgbench
 
 What values did you use?
 
  
  Here is our hardware
  
  
  Dual Intel Xeon 2.8GHz
  6GB RAM
  Linux 2.4 kernel
  RedHat Enterprise Linux AS 3
  200GB for PGDATA on 3Par, ext3
  50GB for WAL on 3Par, ext3
  
  With PostgreSql 8.1.4
  
  We don't have i/o bottle neck. 
 
 Are you sure? What does iostat say during a pgbench? What parameters are
 you passing to pgbench?
 
 Well in theory, upgrading to 2.6 kernel will help as well as making your
 WAL ext2 instead of ext3.
 
  Whatelse I can try to better tps? Someone told me I can should get tps
 
  over 1500, it is hard to believe.
 
 1500? Hmmm... I don't know about that, I can get 470tps or so on my
 measily dual core 3800 with 2gig of ram though.
 
 Joshua D. Drake
 
 
  
  Thanks
  
  Marty
  
 
 



Re: [PERFORM] How to get higher tps

2006-08-21 Thread Mark Lewis
Not much we can do unless you give us more info about how you're testing
(pgbench setup), and what you've done with the parameters you listed
below.  It would also be useful if you told us more about your drive
array than just 3Par.  We need to know the RAID level, number/speed of
disks, whether it's got a battery-backed write cache that's turned on,
things like this.

Like Jeff just said, it's likely that you're waiting for rotational
latency, which would limit your maximum tps for sequential jobs based on
the number of disks in your array.  For example, a 2-disk array of 10k
RPM disks is going to max out somewhere around 333 tps.  (2 * 10000/60).

-- Mark Lewis

 

On Mon, 2006-08-21 at 16:45 -0400, Marty Jia wrote:
 I'm exhausted to try all performance tuning ideas, like following
 parameters
 
 shared_buffers 
 fsync 
 max_fsm_pages
 max_connections 
 shared_buffers  
 work_mem 
 max_fsm_pages
 effective_cache_size 
 random_page_cost 
 
 I believe all above have right size and values, but I just can not get
 higher tps more than 300 tested by pgbench
 
 Here is our hardware
 
 
 Dual Intel Xeon 2.8GHz
 6GB RAM
 Linux 2.4 kernel
 RedHat Enterprise Linux AS 3
 200GB for PGDATA on 3Par, ext3
 50GB for WAL on 3Par, ext3
 
 With PostgreSql 8.1.4
 
 We don't have i/o bottle neck. 
 
 What else can I try to get better tps? Someone told me I should get tps
 over 1500, it is hard to believe.
 
 Thanks
 
 Marty
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL runs a query much slower than BDE and

2006-08-17 Thread Mark Lewis
MSSQL can give either a graphical query plan or a text-based one similar
to PG.  There's no way that I've found to get the equivalent of an
EXPLAIN ANALYZE, but I'm by no means an MSSQL guru.

To get a neat-looking but not very useful graphical query plan from the
Query Analyzer tool, hit Ctrl-L.

To get the text-based one, execute SET SHOWPLAN_ALL ON which toggles
diagnostic mode on, and each query that you run will return the explain
plan instead of actually running until you execute SET SHOWPLAN_ALL
OFF.
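
As a rough sketch of what that Query Analyzer session looks like (the table and
query here are made up, and each SET has to sit in its own batch):

  SET SHOWPLAN_ALL ON
  GO
  SELECT * FROM orders WHERE order_id = 42
  GO
  SET SHOWPLAN_ALL OFF
  GO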

-- Mark Lewis

On Thu, 2006-08-17 at 09:11 -0400, Tom Lane wrote:
 Peter Hardman [EMAIL PROTECTED] writes:
  I wonder whether Paradox and MySQL are just not doing the sort (this
  seems to be what eats up the time), since the output of the subquery
  is in fact already in the proper order.
 
 MSSQL (from the other thread).  I feel fairly safe in assuming that
 MySQL's query optimizer is not nearly in the league to do this query
 effectively.  (I like the theory Arjen mentioned that what you are
 measuring there is the effects of their query cache rather than a
 smart fundamental implementation.)  I wonder whether MSSQL has an
 EXPLAIN equivalent ...
 
 Anywy, your point about the sort being redundant is a good one, and
 offhand I'd have expected PG to catch that; I'll have to look into
 why it didn't.  But that's not going to explain a 10x speed
 difference, because the sort isn't 90% of the runtime.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Inner Join of the same table

2006-08-15 Thread Mark Lewis
Can you provide an EXPLAIN ANALYZE of the query in PG?  Have you
analyzed the PG database?  How many rows is this query expected to
return?  Which version of PG are you running?  What indexes have you
defined?

-- Mark

On Tue, 2006-08-15 at 14:38 +, Sebastián Baioni wrote:
 Hello, I'm migrating from MS SQL Server to PostgreSQL 8.1 and I have a
 serious problem:
 Table: APORTES - Rows: 9,000,000 (9 million)
 *cuiT (char 11)
 *cuiL (char 11)
 *PERI (char 6)
 FAMI (numeric 6)
 
 I need all the cuiLs whose max(PERI) are from a cuiT, and the Max
 (FAMI) of those cuiLs, so the sentence is:
 
 SELECT DISTINCT T.cuiT, T.cuiL. U.MAXPERI, U.MAXFAMI
FROM APORTES T
INNER JOIN
(SELECT cuiL, MAX(PERI) AS MAXPERI,
MAX(FAMI) AS MAXFAMI
 FROM APORTES
 GROUP BY cuiL) AS U
ON T.cuiL = U.cuiL AND T.PERI=U.MAXPERI
 WHERE T.cuiT='12345678901'
 
 In MS SQL Server it lasts 1minute, in PostgreSQL for Windows it lasts
 40minutes and in PostgreSQL for Linux (FreeBSD) it lasts 20minuts.
 
 Do you know if there is any way to tune the server or optimize this
 sentence?
 
 Thanks
  Sebastián Baioni
 
 Instrumentos musicalesSebastián Baioni Ofertas náuticas
 
 
 __
 Preguntá. Respondé. Descubrí.
 Todo lo que querías saber, y lo que ni imaginabas,
 está en Yahoo! Respuestas (Beta).
 Probalo ya! 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance with 2 AMD/Opteron 2.6Ghz and 8gig

2006-07-28 Thread Mark Lewis
This isn't all that surprising.  The main weaknesses of RAID-5 are poor
write performance and stupid hardware controllers that make the write
performance even worse than it needs to be.  Your numbers bear that out.
Reads off RAID-5 are usually pretty good.

Your 'dd' test is going to be a little misleading though.  Most DB
access isn't usually purely sequential; while it's easy to see why HW
RAID-5 might outperform HW-RAID-10 in large sequential reads (the RAID
controller would need to be smarter than most to make RAID-10 as fast as
RAID-5), I would expect that HW RAID-5 and RAID-10 random reads would be
about equal or else maybe give a slight edge to RAID-10. 

-- Mark Lewis


On Fri, 2006-07-28 at 13:31 -0400, Jeff Trout wrote:
 I too have a DL385 with a single DC Opteron 270.
 It claims to have a smart array 6i controller and over the last  
 couple of days I've been runnign some tests on it, which have been  
 yielding some suprising results.
 
 I've got 6 10k U320 disks in it. 2 are in a mirror set.  We'll not  
 pay any attention to them.
 The remaining 4 disks I've been toying with to see what config works  
 best, using hardware raid and software raid.
 
 system info:
 dl dl385 - 1 opteron 270 - 5GB ram - smart array 6i
 cciss0: HP Smart Array 6i Controller
 Firmware Version: 2.58
 Linux db03 2.6.17-1.2157_FC5 #1 SMP Tue Jul 11 22:53:56 EDT 2006  
 x86_64 x86_64 x86_64 GNU/Linux
 using xfs
 
 Each drive can sustain 80MB/sec read (dd, straight off device)
 
 So here are the results I have so far.  (averaged)
 
 
 hardware raid 5:
 dd - write 20GB file - 48MB/sec
 dd - read 20GB file - 247MB/sec
 [ didn't do a bonnie run on this yet ]
 pretty terrible write performance. good read.
 
 hardware raid 10
 dd - write 20GB - 104MB/sec
 dd - read 20GB - 196MB/sec
 bonnie++
 Version  1.03   --Sequential Output-- --Sequential Input-  
 --Random-
  -Per Chr- --Block-- -Rewrite- -Per Chr- -- 
 Block-- --Seeks--
 MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec % 
 CP  /sec %CP
 db03  9592M 45830  97 129501  31 62981  14 48524  99 185818   
 19 949.0   1
 
 software raid 5
 dd - write 20gb - 85MB/sec
 dd - read 20gb - 135MB/sec
 
 I was very suprised at those results. I was sort of expecting it to  
 smoke the hardware. I repeated the test many times, and kept getting
 these numbers.
 
 bonnie++:
 Version  1.03   --Sequential Output-- --Sequential Input-  
 --Random-
  -Per Chr- --Block-- -Rewrite- -Per Chr- -- 
 Block-- --Seeks--
 MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec % 
 CP  /sec %CP
 db03  9592M 44110  97 81481  23 34604  10 44495  95 157063   
 28 919.3   1
 
 software 10:
 dd - write - 20GB - 108MB/sec
 dd - read - 20GB - 86MB/sec( WTF? - this is repeatable!!)
 bonnie++
 Version  1.03   --Sequential Output-- --Sequential Input-  
 --Random-
  -Per Chr- --Block-- -Rewrite- -Per Chr- -- 
 Block-- --Seeks--
 MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec % 
 CP  /sec %CP
 db03  9592M 44539  98 105444  20 34127   8 39830  83 100374   
 10  1072   1
 
 
 so I'm going to be going with hw r5, which went against what I  
 thought going in - read perf is more important for my usage than write.
 
 I'm still not sure about that software 10 read number. something is  
 not right there...
 
 --
 Jeff Trout [EMAIL PROTECTED]
 http://www.dellsmartexitin.com/
 http://www.stuarthamm.net/
 
 
 
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] index usage

2006-07-28 Thread Mark Lewis
A volatile function may return a different result for each row;
think of the random() or nextval() functions for example.  You wouldn't
want them to return the same value for each row returned.
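
If the function's result really is fixed for the duration of a statement, then
declaring it STABLE (or IMMUTABLE) lets the planner treat it as a constant and
consider the index.  A minimal sketch, with a made-up function and table:

  CREATE FUNCTION current_batch_id() RETURNS integer
      AS 'SELECT 42' LANGUAGE SQL STABLE;

  -- the index on indexed_col can now be considered:
  SELECT * FROM t WHERE indexed_col = current_batch_id();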

-- Mark Lewis

On Fri, 2006-07-28 at 13:59 -0700, Ben wrote:
 It's volatile, but it will always return an integer.
 
 On Fri, 28 Jul 2006, Daniel Caune wrote:
 
  De : [EMAIL PROTECTED] [mailto:pgsql-performance-
  [EMAIL PROTECTED] De la part de Ben
  Envoyé : vendredi, juillet 28, 2006 15:21
  À : pgsql-performance@postgresql.org
  Objet : [PERFORM] index usage
 
  I have a table with 37000 rows, an integer column, and an index on that
  column. I've got a function that returns an integer. When I do a select
  where I restrict that column to being equal to a static number, explain
  tells me the index will be used. When I do the same thing but use the
  function instead of a static number, explain shows me a full scan on the
  table.
 
  I must be missing something, because my understanding is that the function
  will be evaluated once for the statement and then collapsed into a static
  number for the filtering. But the results of the explain seem to imply
  that's not the case?
 
 
  Is your function IMMUTABLE, STABLE or VOLATILE?
 
  --
  Daniel
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] Savepoint performance

2006-07-27 Thread Mark Lewis
All,

I support a system that runs on several databases including PostgreSQL.
I've noticed that the other DB's always put an implicit savepoint before
each statement executed, and roll back to that savepoint if the
statement fails for some reason.  PG does not, so unless you manually
specify a savepoint you lose all previous work in the transaction.

So my question is, how expensive is setting a savepoint in PG?  If it's
not too expensive, I'm wondering if it would be feasible to add a config
parameter to psql or other client interfaces (thinking specifically of
jdbc here) to do it automatically.  Doing so would make it a little
easier to work with PG in a multi-db environment.

My main reason for wanting this is so that I can more easily import,
say, 50 new 'objects' (related rows stored across several tables) in a
transaction instead of only one at a time without fear that an error in
one object would invalidate the whole batch.  I could do this now by
manually setting savepoints, but if it's not a big deal performance-wise
to modify the JDBC driver to start an anonymous savepoint with each
statement, then I'd prefer that approach as it seems that it would make
life easier for other folks too.
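
For reference, this is roughly what the manual-savepoint version looks like in
plain SQL (table name and values are made up):

  BEGIN;

  SAVEPOINT obj_1;
  INSERT INTO objects (id, name) VALUES (1, 'first');
  RELEASE SAVEPOINT obj_1;               -- keep this object's work

  SAVEPOINT obj_2;
  INSERT INTO objects (id, name) VALUES (2, 'second');
  -- if this object had failed: ROLLBACK TO SAVEPOINT obj_2;  then move on
  RELEASE SAVEPOINT obj_2;

  COMMIT;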

Thanks in advance for any feedback :)

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Commit slower on faster PC

2006-07-12 Thread Mark Lewis
The IDE drive is almost certainly lying about flushing data to the disk.
Lower-end consumer drives often do.

What this means is that commits will be a whole lot faster, but the
database loses its ACID guarantees, because a power failure at the wrong
moment could corrupt the whole database.

If you don't care about your data and want the SCSI drive to perform
fast just like the IDE drive, you can set fsync = off in your
configuration file.

-- Mark

On Wed, 2006-07-12 at 10:16 -0600, Koth, Christian (DWBI) wrote:
 Hi,
 
 please help me with the following problem:
 
 I have noticed a strange performance behaviour using a commit statement on 
 two different machines. On one of the machines the commit is many times 
 faster than on the other machine which has faster hardware. Server and client 
 are running always on the same machine.
 
 Server version (same on both machines): PostgreSQL 8.1.3. (same binaries as 
 well)
 
 PC1:
 
 Pentium 4 (2.8 GHz)
 1GB RAM
 IDE-HDD (approx. 50 MB/s rw), fs: ext3
 Mandrake Linux: Kernel 2.4.22
 
 
 PC2:
 
 Pentium 4 (3.0 GHz)
 2GB RAM
 SCSI-HDD (approx. 65 MB/s rw), fs: ext3
 Mandrake Linux: Kernel 2.4.32
 
 
 Both installations of the database have the same configuration, different 
 from default are only the following settings on both machines:
 
 shared_buffers = 2
 listen_addresses = '*'
 max_stack_depth = 4096
 
 
 pgbench gives me the following results:
 PC1:
 
 transaction type: TPC-B (sort of)
 scaling factor: 1
 number of clients: 1
 number of transactions per client: 10
 number of transactions actually processed: 10/10
 tps = 269.905533 (including connections establishing)
 tps = 293.625393 (excluding connections establishing)
 
 PC2:
 
 transaction type: TPC-B (sort of)
 scaling factor: 1
 number of clients: 1
 number of transactions per client: 10
 number of transactions actually processed: 10/10
 tps = 46.061935 (including connections establishing)
 tps = 46.519634 (excluding connections establishing)
 
 
 My own performance test sql script which inserts and (auto)commits some data 
 into a simple table produces the following log output in the server log:
 
 PC1:
 
 LOG:  duration: 1.441 ms  statement: INSERT INTO performance_test VALUES 
 (500938362, 'Xawhefjmd');
 STATEMENT:  INSERT INTO performance_test VALUES (500938362, 'Xawhefjmd');
 
 PC2:
 
 LOG:  duration: 29.979 ms  statement: INSERT INTO performance_test VALUES 
 (500938362, 'Xawhefjmd');
 STATEMENT:  INSERT INTO performance_test VALUES (500938362, 'Xawhefjmd');
 
 
 I created a 'strace' on both machines which is interesting:
 
 Opening the socket:
 ---
 PC1: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 10 0.21
 PC2: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 8 0.15
 
 PC1: bind(10, {sa_family=AF_INET, sin_port=htons(0), 
 sin_addr=inet_addr(0.0.0.0)}, 16) = 0 0.07
 PC2: bind (8, {sin_family=AF_INET, sin_port=htons(0), 
 sin_addr=inet_addr(0.0.0.0)}}, 16) = 0 0.07
 
 PC1: getsockname(10, {sa_family=AF_INET, sin_port=htons(32820), 
 sin_addr=inet_addr(0.0.0.0)}, [16]) = 0 0.05
 PC2: getsockname( 8, {sin_family=AF_INET, sin_port=htons(36219), 
 sin_addr=inet_addr(0.0.0.0)}}, [16]) = 0 0.05
 
 PC1: connect(10, {sa_family=AF_INET, sin_port=htons(5432), 
 sin_addr=inet_addr(127.0.0.1)}, 16) = 0 0.000440
 PC2: connect( 8, {sin_family=AF_INET, sin_port=htons(5432), 
 sin_addr=inet_addr(127.0.0.1)}}, 16) = 0 0.000394
 
 PC1: setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0 0.06
 PC2: setsockopt (8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 0.04
 
 
 Inserting and committing the data: exec. time
 -
 PC1:
 
 send(10, B\\0\0INSERT INTO performance_test VAL..., 175, 0) = 175 
 0.15
 recv(10, 2\0\17INSERT 0 1\0Z\0\0\0\5T, 8192, 0) = 53 0.07
 send(10, B\0\0\0\17\0S_2\0\0\0\0\0\0\0E\0\0\0\t\0\0\0\0\1S\0\0\0\4, 31, 0) 
 = 31 0.11
 recv(10, 2\0\0\0\4C\0\0\0\vCOMMIT\0Z\0\0\0\5I, 8192, 0) = 23 0.000211
 
 PC2:
 
 send(8, B\\0\0INSERT INTO performance_test VAL..., 175, 0) = 175 
 0.14
 recv(8, 2\0\17INSERT 0 1\0Z\0\0\0\5T, 8192, 0) = 53 0.05
 send(8, B\0\0\0\17\0S_2\0\0\0\0\0\0\0E\0\0\0\t\0\0\0\0\1S\0\0\0\4, 31, 0) = 
 31 0.09
 recv(8, 2\0\0\0\4C\0\0\0\vCOMMIT\0Z\0\0\0\5I, 8192, 0) = 23 0.0253
 
 Every command is a bit faster on PC2 except the last one which is many times 
 slower.
 Any help or hint where to look at would be highly appreciated because I'm 
 running out of ideas ;-).
 
 
 regards,
 Christian
 
 
 **
 The information contained in, or attached to, this e-mail, may contain 
 confidential information and is intended solely for the use of the individual 
 or entity to whom they are addressed and may be subject to legal privilege.  
 If you have received this e-mail in error you should notify the sender 
 immediately by reply e-mail, delete the message from your system and notify 
 your system manager.  Please do not copy it 

Re: [PERFORM] Postgres consuming way too much memory???

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 11:34 -0400, Tom Lane wrote:
 jody brownell [EMAIL PROTECTED] writes:
  When postgresql starts to go into this bloating state, I can only make it 
  happen from my java app.
 
 That's interesting.  The JDBC driver uses protocol features that aren't
 used by psql, so it's possible that the leak is triggered by one of
 those features.  I wouldn't worry too much about duplicating the problem
 from psql anyway --- a Java test case will do fine.
 
  I am going to try closing the connection after each TX to see if this
  resolves it for now. If not, I will write a java app, stored procedure
  (table etc) reproduce it without our application.


Just to mention another possible culprit; this one doesn't seem all that
likely to me, but at least it's easy to investigate.

With DBCP and non-ancient versions of the JDBC driver that use v3
protocol and real prepared statements, it is possible to (mis)configure
the system to create an unbounded number of cached prepared statements
on any particular connection.  Older versions of DBCP were also known to
have bugs which aggravated this issue when prepared statement caching
was enabled, IIRC.

-- Mark Lewis

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:
 Now I've been told by our DBA that we should have been able to wholy
 satisfy that query via the indexes.

DB2 can satisfy the query using only indexes because DB2 doesn't do
MVCC.

Although MVCC is generally a win in terms of making the database easier
to use and applications less brittle, it also means that the database
must inspect the visibility information for each row before it can
answer a query.  For most types of queries this isn't a big deal, but
for count(*) type queries, it slows things down.

Since adding the visibility information to indexes would make them
significantly more expensive to use and maintain, it isn't done.
Therefore, each row has to be fetched from the main table anyway.

Since in this particular query you are counting all rows of the
database, PG must fetch each row from the main table regardless, so the
sequential scan is much faster because it avoids traversing the index
and performing random read operations.

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 14:46 -0400, John Vincent wrote:

 One question that we came up with is how does this affect other
 aggregate functions like MAX,MIN,SUM and whatnot? Being that this is
 our data warehouse, we use these all the time. As I've said
 previously, I didn't know a human could generate some of the queries
 we've passed through this system. 

Previously, MIN and MAX would also run slowly, for the same reason as
COUNT(*).  But there really isn't a need for that, since you can still
get a big speedup by scanning the index in order, looking up each row
and stopping as soon as you find a visible one.

This has been fixed so newer versions of PG will run quickly and use the
index for MIN and MAX.  I don't remember which version had that change;
it might not be until 8.2.  You can dig the archives to find out for
sure. 

For older versions of PG before the fix, you can make MIN and MAX run
quickly by rewriting them in the following form:

SELECT column FROM table ORDER BY column LIMIT 1;
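
The corresponding rewrite for MAX is the descending form (a sketch; the IS NOT
NULL filter keeps NULLs, which sort first in descending order, out of the way):

  SELECT column FROM table WHERE column IS NOT NULL
  ORDER BY column DESC LIMIT 1;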

Unfortunately SUM is in the same boat as COUNT; in order for it to
return a meaningful result it must inspect visibility information for
all of the rows.

-- Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] SAN performance mystery

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 18:24 -0400, Tom Lane wrote:
 I agree with Brian's suspicion that the SATA drive isn't properly
 fsync'ing to disk, resulting in bogusly high throughput.  However,
 ISTM a well-configured SAN ought to be able to match even the bogus
 throughput, because it should be able to rely on battery-backed
 cache to hold written blocks across a power failure, and hence should
 be able to report write-complete as soon as it's got the page in cache
 rather than having to wait till it's really down on magnetic platter.
 Which is what the SATA drive is doing ... only it can't keep the promise
 it's making for lack of any battery backup on its on-board cache.

It really depends on your SAN RAID controller.  We have an HP SAN; I
don't remember the model number exactly, but we ran some tests and with
the battery-backed write cache enabled, we got some improvement in write
performance but it wasn't NEARLY as fast as an SATA drive which lied
about write completion.

The write-and-fsync latency was only about 2-3 times better than with no
write cache at all.  So I wouldn't assume that just because you've got a
write cache on your SAN, that you're getting the same speed as
fsync=off, at least for some cheap controllers.

-- Mark Lewis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Mark Lewis
No.  You need fsync on in order to force the data to get TO the NetApp
at the right time.  With fsync off, the data gets cached in the
operating system.

-- Mark Lewis

On Wed, 2006-06-14 at 14:48 -0700, Dan Gorman wrote:
 All,
So I thought I'd pose this question:
 
 If I have a pg database attached to a powervault (PV) with just an  
 off-the-shelf SCSI card I generally want fsync on to prevent data  
 corruption in case the PV should loose power.
 However, if I have it attached to a NetApp that ensures data writes  
 to via the NVRAM can I safely turn fsync off to gain additional  
 performance?
 
 Best Regards,
 Dan Gorman
 
 
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Initial database loading and IDE x SCSI

2006-06-02 Thread Mark Lewis
On Fri, 2006-06-02 at 15:25 -0300, [EMAIL PROTECTED] wrote:
 Hi,
 
 I would like to know if my supposition is right.
 
 Considering an environment with only one hard disk attached to a server, an
 initial loading of the database is probably much faster using an IDE/ATA
 interface with write-back on than using a SCSI interface. That's because of
 the SCSI command interface overhead.

No, it's because the SCSI drive is honoring the database's request to
make sure the data is safe.

 The main advantage of SCSI interfaces, the multiuser environment, is lost in
 this scenario.
 
 Am I right? Am I missing something here?
 
 Even if I'm right, is there something that could be done to improve SCSI loading
 performance in this scenario?

You can perform the initial load in large transactions.  The extra
overhead for ensuring that data is safely written to disk will only be
incurred once per transaction, so try to minimize the number of
transactions.
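
In other words, wrap each big chunk of the load in one transaction, along these
lines (the table and file names are made up):

  BEGIN;
  COPY big_table   FROM '/tmp/big_table.copy';
  COPY other_table FROM '/tmp/other_table.copy';
  COMMIT;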

You could optionally set fsync=off in postgresql.conf, which means that
the SCSI drive will operate with no more safety than an IDE drive.  But
you should only do that if you're willing to deal with catastrophic data
corruption.  But if this is for a desktop application where you need to
support IDE drives, you'll need to deal with that anyway.

 Thanks in advance!
 
 Reimer
 
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: RES: RES: [PERFORM] Initial database loading and IDE x SCSI

2006-06-02 Thread Mark Lewis
On Fri, 2006-06-02 at 17:37 -0300, [EMAIL PROTECTED] wrote:
 Many thanks Mark,
 
 I will consider fsync=off only to do an initial load, not for a database 
 normal operation.
 

This approach works well.  You just need to remember to shut down the
database and start it back up again with fsync enabled for it to be safe
after the initial load.

 I was just thinking about this hypothetical scenario: 
 a) a restore database operation
 b) fsync off
 c) write-back on (IDE)
 
 As I understand it, in this scenario, it's normal for the IDE drive to be faster 
 than the SCSI, ok?
 

If fsync is off, then the IDE drive loses its big advantage, so IDE and
SCSI should be about the same speed.

 Of course, the database is exposed because of the fsync=off, but if you 
 consider only the system performance, then it is true. Isn't it?



 Thanks,
 
 Reimer
 
 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] is it possible to make this faster?

2006-05-25 Thread Mark Lewis
On Thu, 2006-05-25 at 16:52 -0400, Tom Lane wrote:
 Merlin Moncure [EMAIL PROTECTED] writes:
  been doing a lot of pgsql/mysql performance testing lately, and there
  is one query that mysql does much better than pgsql...and I see it a
  lot in normal development:
 
  select a,b,max(c) from t group by a,b;
 
  t has an index on a,b,c.
 
 The index won't help, as per this comment from planagg.c:
 
* We don't handle GROUP BY, because our current implementations of
* grouping require looking at all the rows anyway, and so there's not
* much point in optimizing MIN/MAX.
 
 Given the numbers you mention (300k rows in 2000 groups) I'm not
 convinced that an index-based implementation would help much; we'd
 still need to fetch at least one record out of every 150, which is
 going to cost near as much as seqscanning all of them.

Well, if the MySQL server has enough RAM that the index is cached (or
index + relevant chunks of data file if using InnoDB?) then that would
explain how MySQL can use an index to get fast results.

-- Mark Lewis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Getting even more insert performance (250m+rows/day)

2006-05-24 Thread Mark Lewis
 The data gets inserted in batches every 5 minutes and I potentially  
 have people querying it constantly, so I can't remove and re-create  
 the index.

How live does your data need to be?  One possibility would be to use a
separate table for each batch instead of a separate table per day,
create the indexes after the import and only after the indexes have been
created make the table available for user queries.

You'd be trading latency for throughput in that case.
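
A sketch of that per-batch workflow, with made-up names; the batch table is only
published to user queries (for example by adding it to a UNION ALL view or an
inheritance parent) once its indexes exist:

  CREATE TABLE batch_20060524_1200 (LIKE events);
  COPY batch_20060524_1200 FROM '/tmp/batch_20060524_1200.copy';
  CREATE INDEX batch_20060524_1200_ts_idx ON batch_20060524_1200 (event_time);
  -- now make the table visible to user queries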

Also, you mentioned that you're CPU-bound, but that you have multiple
CPU's.  In that case, performing N concurrent imports (where N is the
number of processor cores available) might be a win over a single-
threaded import.

-- Mark Lewis

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] performance question (something to do w/

2006-05-08 Thread Mark Lewis
Doing a SELECT with a large list of variables inside an IN runs slowly
on every database we've tested.  We've tested mostly in Oracle and
PostgreSQL, and both get very slow very quickly (actually Oracle refuses
to process the query at all after it gets too many bind parameters).

In our case, we have a (potentially quite large) set of external values
that we want to look up in the database.  We originally thought that
doing a single select with a large IN clause was the way to go, but then
we did some performance analysis on the optimal batch size (number of
items to include per IN clause), and discovered that for most databases,
the optimal batch size was 1.  For PostgreSQL I think it was 2.

The moral of the story is that you're probably better off running a
bunch of small selects than in trying to optimize things with one
gargantuan select.
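
Concretely, with the table and columns from the query below, that means trading
the single statement for many tiny ones (the literal key values here are made
up):

  -- one gargantuan select:
  --   SELECT * FROM x WHERE f IN ($1, $2, ... , $650) ORDER BY f, c;

  -- versus a batch size of 1:
  SELECT * FROM x WHERE f = 101 ORDER BY f, c;
  SELECT * FROM x WHERE f = 102 ORDER BY f, c;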

-- Mark Lewis

On Mon, 2006-05-08 at 13:29 -0400, Jeffrey Tenny wrote:
 Why does this query take so long?  (PostgreSQL 8.0.3, FC4)
 Hopefully I have provided enough information below.
 
 LOG:  statement: SELECT * FROM x WHERE f IN 
 ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,\
 $25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63\
 ,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,\
 $102,$103,$104,$105,$106,$107,$108,$109,$110,$111,$112,$113,$114,$115,$116,$117,$118,$119,$120,$121,$122,$123,$124,$125,$126,$127,$128,$129,$130,$131,$132,\
 $133,$134,$135,$136,$137,$138,$139,$140,$141,$142,$143,$144,$145,$146,$147,$148,$149,$150,$151,$152,$153,$154,$155,$156,$157,$158,$159,$160,$161,$162,$163,\
 $164,$165,$166,$167,$168,$169,$170,$171,$172,$173,$174,$175,$176,$177,$178,$179,$180,$181,$182,$183,$184,$185,$186,$187,$188,$189,$190,$191,$192,$193,$194,\
 $195,$196,$197,$198,$199,$200,$201,$202,$203,$204,$205,$206,$207,$208,$209,$210,$211,$212,$213,$214,$215,$216,$217,$218,$219,$220,$221,$222,$223,$224,$225,\
 $226,$227,$228,$229,$230,$231,$232,$233,$234,$235,$236,$237,$238,$239,$240,$241,$242,$243,$244,$245,$246,$247,$248,$249,$250,$251,$252,$253,$254,$255,$256,\
 $257,$258,$259,$260,$261,$262,$263,$264,$265,$266,$267,$268,$269,$270,$271,$272,$273,$274,$275,$276,$277,$278,$279,$280,$281,$282,$283,$284,$285,$286,$287,\
 $288,$289,$290,$291,$292,$293,$294,$295,$296,$297,$298,$299,$300,$301,$302,$303,$304,$305,$306,$307,$308,$309,$310,$311,$312,$313,$314,$315,$316,$317,$318,\
 $319,$320,$321,$322,$323,$324,$325,$326,$327,$328,$329,$330,$331,$332,$333,$334,$335,$336,$337,$338,$339,$340,$341,$342,$343,$344,$345,$346,$347,$348,$349,\
 $350,$351,$352,$353,$354,$355,$356,$357,$358,$359,$360,$361,$362,$363,$364,$365,$366,$367,$368,$369,$370,$371,$372,$373,$374,$375,$376,$377,$378,$379,$380,\
 $381,$382,$383,$384,$385,$386,$387,$388,$389,$390,$391,$392,$393,$394,$395,$396,$397,$398,$399,$400,$401,$402,$403,$404,$405,$406,$407,$408,$409,$410,$411,\
 $412,$413,$414,$415,$416,$417,$418,$419,$420,$421,$422,$423,$424,$425,$426,$427,$428,$429,$430,$431,$432,$433,$434,$435,$436,$437,$438,$439,$440,$441,$442,\
 $443,$444,$445,$446,$447,$448,$449,$450,$451,$452,$453,$454,$455,$456,$457,$458,$459,$460,$461,$462,$463,$464,$465,$466,$467,$468,$469,$470,$471,$472,$473,\
 $474,$475,$476,$477,$478,$479,$480,$481,$482,$483,$484,$485,$486,$487,$488,$489,$490,$491,$492,$493,$494,$495,$496,$497,$498,$499,$500,$501,$502,$503,$504,\
 $505,$506,$507,$508,$509,$510,$511,$512,$513,$514,$515,$516,$517,$518,$519,$520,$521,$522,$523,$524,$525,$526,$527,$528,$529,$530,$531,$532,$533,$534,$535,\
 $536,$537,$538,$539,$540,$541,$542,$543,$544,$545,$546,$547,$548,$549,$550,$551,$552,$553,$554,$555,$556,$557,$558,$559,$560,$561,$562,$563,$564,$565,$566,\
 $567,$568,$569,$570,$571,$572,$573,$574,$575,$576,$577,$578,$579,$580,$581,$582,$583,$584,$585,$586,$587,$588,$589,$590,$591,$592,$593,$594,$595,$596,$597,\
 $598,$599,$600,$601,$602,$603,$604,$605,$606,$607,$608,$609,$610,$611,$612,$613,$614,$615,$616,$617,$618,$619,$620,$621,$622,$623,$624,$625,$626,$627,$628,\
 $629,$630,$631,$632,$633,$634,$635,$636,$637,$638,$639,$640,$641,$642,$643,$644,$645,$646,$647,$648,$649,$650)
  
 ORDER BY f,c
 
 LOG:  EXECUTOR STATISTICS
 DETAIL:  ! system usage stats:
  !   10.282945 elapsed 10.23 user 0.048992 system sec
  !   [25.309152 user 0.500923 sys total]
  !   0/0 [0/0] filesystem blocks in/out
  !   0/0 [0/10397] page faults/reclaims, 0 [0] swaps
  !   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
  !   0/15 [291/55] voluntary/involuntary context switches
  ! buffer usage stats:
  !   Shared blocks:  0 read,  0 written, 
 buffer hit rate = 100.00%
  !   Local  blocks:  0 read,  0 written, 
 buffer hit rate = 0.00

Re: [PERFORM] Slow restoration question

2006-05-02 Thread Mark Lewis
They are not equivalent.  As I understand it, RAID 0+1 performs about
the same as RAID 10 when everything is working, but degrades much less
nicely in the presence of a single failed drive, and is more likely to
suffer catastrophic data loss if multiple drives fail.

-- Mark

On Tue, 2006-05-02 at 12:40 -0600, Brendan Duddridge wrote:
 Everyone here always says that RAID 5 isn't good for Postgres. We  
 have an Apple Xserve RAID configured with RAID 5. We chose RAID 5  
 because Apple said their Xserve RAID was optimized for RAID 5. Not  
 sure if we made the right decision though. They give an option for  
 formatting as RAID 0+1. Is that the same as RAID 10 that everyone  
 talks about? Or is it the reverse?
 
 Thanks,
 
 
 Brendan Duddridge | CTO | 403-277-5591 x24 |  [EMAIL PROTECTED]
 
 ClickSpace Interactive Inc.
 Suite L100, 239 - 10th Ave. SE
 Calgary, AB  T2G 0V9
 
 http://www.clickspace.com
 
 On May 2, 2006, at 11:16 AM, Jim C. Nasby wrote:
 
  On Wed, Apr 26, 2006 at 05:14:41PM +0930, Eric Lam wrote:
  all dumpfiles total about 17Gb. It has been running for 50ish hrs  
  and up
  to about the fourth file (5-6 ish Gb) and this is on a raid 5 server.
 
  RAID5 generally doesn't bode too well for performance; that could be
  part of the issue.
  -- 
  Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
  Pervasive Software  http://pervasive.comwork: 512-231-6117
  vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
 
  ---(end of  
  broadcast)---
  TIP 4: Have you searched our list archives?
 
 http://archives.postgresql.org
 
 
 
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] hardare config question

2006-05-01 Thread Mark Lewis
A UPS will make it less likely that the system will reboot and destroy
your database due to a power failure, but there are other causes for a
system reboot.

With a BBU, the only component that can fail and cause catastrophic data
loss is the RAID itself.

With a UPS, you are additionally vulnerable to OS crashes, failures in
non-RAID hardware, UPS failures, or anything else that would necessitate
a hard reboot.  

So a UPS is a decent replacement for a BBU only if you trust your app
server/OS more than you value your data.

-- Mark Lewis


On Mon, 2006-05-01 at 10:58 -0700, Erik Myllymaki wrote:
 I have been in discussion with 3ware support and after adjusting some 
 settings, 
 the 3ware card in RAID 1 gets better performance than the single drive. I 
 guess 
 this had everything to do with the write (and maybe read?) cache.
 
 Of course now i am in a dangerous situation - using volatile write cache 
 without a BBU.
 
 If I were to use a UPS to ensure a soft shutdown in the event of power loss, 
 am 
 I somewhat as safe as if I were to purchase a BBU for this RAID card?
 
 
 
 Thanks.
 
 Mark Lewis wrote:
  It's also possible that the single SATA drive you were testing (or the
  controller it was attached to) is lying about fsync and performing write
  caching behind your back, whereas your new controller and drives are
  not.
  
  You'll find a lot more info on the archives of this list about it, but
  basically if your application is committing a whole lot of small
  transactions, then it will run fast (but not safely) on a drive which
  lies about fsync, but slower on a better disk subsystem which doesn't
  lie about fsync.
  
  Try running a test with fsync=off with your new equipment and if it
  suddenly starts running faster, then you know that's the problem.
  You'll either have a choice of losing all of your data the next time the
  system shuts down uncleanly but being fast, or of running slow, or of
  fixing the applications to use chunkier transactions.
  
  -- Mark
  
  On Fri, 2006-04-28 at 13:36 -0400, Vivek Khera wrote:
  On Apr 28, 2006, at 11:37 AM, Erik Myllymaki wrote:
 
  When I had this installed on a single SATA drive running from the  
  PE1800's on-board SATA interface, this operation took anywhere from  
  65-80 seconds.
 
  With my new RAID card and drives, this operation took 272 seconds!?
  switch it to RAID10 and re-try your experiment.  if that is fast,  
  then you know your raid controller does bad RAID5.
 
  anyhow, I have in one server (our office mail server and part-time  
  development testing box) an adaptec SATA RAID from dell.  it is  
  configured for RAID5 and does well for normal office stuff, but when  
  we do postgres tests on it, it just is plain old awful.
 
  but I have some LSI based cards on which RAID5 is plenty fast and  
  suitable for the DB, but those are SCSI.
 
  For what it is worth, the Dell PE1850 internal PERC4/Si card is  
  wicked fast when hooked up with a pair of U320 SCSI drives.
 
 
 
  ---(end of broadcast)---
  TIP 3: Have you checked our extensive FAQ?
 
 http://www.postgresql.org/docs/faq
  
  ---(end of broadcast)---
  TIP 9: In versions below 8.0, the planner will ignore your desire to
 choose an index scan if your joining column's datatypes do not
 match

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Hardware: HP StorageWorks MSA 1500

2006-04-20 Thread Mark Lewis
Hmmm.  We use an MSA 1000 with Fibre Channel interconnects.  No real
complaints, although I was a little bit disappointed by the RAID
controller's battery-backed write cache performance; tiny random writes
are only about 3 times as fast with write caching enabled as with it
disabled, I had (perhaps naively) hoped for more.  Sequential scans from
our main DB (on a 5-pair RAID 10 set with 15k RPM drives) get roughly
80MB/sec.

Getting the redundant RAID controllers to fail over correctly on Linux
was a big headache and required working the tech support phone all day
until we finally got to the deep guru who knew the proper undocumented
incantations.

-- Mark Lewis

On Thu, 2006-04-20 at 20:00 +0200, Mikael Carneholm wrote:
 We're going to get one for evaluation next week (equipped with dual
 2Gbit HBA:s and 2x14 disks, iirc). Anyone with experience from them,
 performance wise?
 
 Regards,
 Mikael
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Database possible corruption , unsolvable mystery

2006-03-29 Thread Mark Lewis
Can you post an explain analyze for the delete query?  That will at
least tell you if it is the delete itself which is slow, or a trigger /
referential integrity constraint check.  Which version of PG is this?

-- Mark Lewis

On Wed, 2006-03-29 at 12:58 -0500, Eric Lauzon wrote:
 Greetings,
 
   We have an issue where we have a database with many tables.
   The layout of the database is 3 sets of look-alike tables with different
 names.
   Each set of tables has some referential integrity that points back to the
 main control table.
 
   On two sets of tables we are able to efficiently delete referential and
 main records without a problem, but on the 3rd set we have an issue where the
 control table is clogged and deletes seem to take forever; for example, on the
 first two sets a delete of 60K records takes about 4 to 10 seconds, but on the
 3rd set it can take as long as 3 hours.
 
   This seems to affect only one database; the schema and way of doing things
 is replicated elsewhere and is efficient.
 
   The question is, even after dropping the 3rd set's integrity, dumping the
 table data, deleting the table, doing a manual checkpoint and recreating the
 table with the dump data, with or without referential integrity, the problems
 still arise.
 
   If we copy the data from the live table and do a create table aaa as 
 select * from problematic_table;
   the table aaa operations are normal and efficient.
 
   This is why our investigation brought us to the following questions:
 
   1. Are postgresql data file names hashed references to the table name (as
 in oracle)? [~path to data EX:/var/log/pgsql/data/[arbitrary
 numbers]/[datafile]]?
 
   2. If the data files are corrupted and we re-create the table, is it
 possible it uses the same files, thus creating the same issue?
 
   3. Since we know that all the tables have that problem, is there an
 internal table with undisclosed references to the tables' data files?
   
   I hope the questions were clear.
 
   Have a good day, and thank you in advance.
 
 
 Eric Lauzon
 [Recherche  Développement]
 Above Sécurité / Above Security
 Tél  : (450) 430-8166
 Fax : (450) 430-1858
 
 AVERTISSEMENT CONCERNANT LA CONFIDENTIALITÉ 
 
 Le présent message est à l'usage exclusif du ou des destinataires mentionnés 
 ci-dessus. Son contenu est confidentiel et peut être assujetti au secret 
 professionnel. Si vous avez reçu le présent message par erreur, veuillez nous 
 en aviser immédiatement et le détruire en vous abstenant d'en faire une 
 copie, d'en divulguer le contenu ou d'y donner suite.
 
 CONFIDENTIALITY NOTICE
 
 This communication is intended for the exclusive use of the addressee 
 identified above. Its content is confidential and may contain privileged 
 information. If you have received this communication by error, please notify 
 the sender and delete the message without copying or disclosing it.
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Postgres and Ingres R3 / SAN

2006-03-07 Thread Mark Lewis
On Tue, 2006-03-07 at 13:00 -0600, Jim C. Nasby wrote:

...

 PostgreSQL on a SAN won't buy you what I think you think it will. It's
 essentially impossible to safely run two PostgreSQL installs off the
 same data files without destroying your data. What a SAN can buy you is
 disk-level replication, but I've no experience with that.

It is possible to run two instances against the same SAN using tools
such as RedHat's Cluster Suite.  We use that in-house as a cheap
alternative for Oracle clustering, although we're not using it for our
PostgreSQL servers yet.  It's not for load balancing, just
active/passive fault tolerance.

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 12:15 -0500, Tom Lane wrote:
 Once or twice we've kicked around the idea of having some
 datatype-specific sorting code paths alongside the general-purpose one,
 but I can't honestly see this as being workable from a code maintenance
 standpoint.
 
   regards, tom lane


It seems that instead of maintaining a different sorting code path for
each data type, you could get away with one generic path and one
(hopefully faster) path if you allowed data types to optionally support
a 'sortKey' interface by providing a function f which maps inputs to 32-
bit int outputs, such that the following two properties hold:

f(a) <= f(b) iff a <= b
if a==b then f(a)==f(b)

So if a data type supports the sortKey interface you could perform the
sort on f(value) and only refer back to the actual element comparison
functions when two sortKeys have the same value.

Data types which could probably provide a useful function for f would be
int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII).

Depending on the overhead, you might not even need to maintain 2
independent search code paths, since you could always use f(x)=0 as the
default sortKey function which would degenerate to the exact same sort
behavior in use today.

-- Mark Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 17:51 -0500, Greg Stark wrote:
   Data types which could probably provide a useful function for f would be
   int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII).
 
 How exactly do you imagine doing this for text?
 
 I could see doing it for char(n)/varchar(n) where n<=4 in SQL_ASCII though.


In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit
sortKey as elsewhere suggested).  The sorting key doesn't need to be a
one-to-one mapping.

-- Mark Lewis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Mark Lewis
Machine 1: $2000
Machine 2: $2000
Machine 3: $2000

Knowing how to rig them together and maintain them in a fully fault-
tolerant way: priceless.


(Sorry for the off-topic post, I couldn't resist).

-- Mark Lewis

On Wed, 2006-02-15 at 09:19 -0800, Craig A. James wrote:
 Jeremy Haile wrote:
  We are a small company looking to put together the most cost effective
  solution for our production database environment.  Currently in
  production Postgres 8.1 is running on this machine:
  
  Dell 2850
  2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
  4 GB DDR2 400 Mhz
  2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
  4 x 146 GB 10K SCSI RAID 10 (for postgres data)
  Perc4ei controller
  
  ... I sent our scenario to our sales team at Dell and they came back with
  all manner of SAN, DAS, and configuration costing as much as $50k.
 
 Given what you've told us, a $50K machine is not appropriate.
 
 Instead, think about a simple system with several clones of the database and 
 a load-balancing web server, even if one machine could handle your load.  If 
 a machine goes down, the load balancer automatically switches to the other.
 
 Look at the MTBF figures of two hypothetical machines:
 
  Machine 1: Costs $2,000, MTBF of 2 years, takes two days to fix on average.
  Machine 2: Costs $50,000, MTBF of 100 years (!), takes one hour to fix on 
 average.
 
 Now go out and buy three of the $2,000 machines.  Use a load-balancer front 
 end web server that can send requests round-robin fashion to a server farm. 
  Clone your database.  In fact, clone the load-balancer too so that all three 
 machines have all software and databases installed.  Call these A, B, and C 
 machines.
 
 At any given time, your Machine A is your web front end, serving requests to 
 databases on A, B and C.  If B or C goes down, no problem - the system keeps 
 running.  If A goes down, you switch the IP address of B or C and make it 
 your web front end, and you're back in business in a few minutes.
 
 Now compare the reliability -- in order for this system to be disabled, you'd 
 have to have ALL THREE computers fail at the same time.  With the MTBF and 
 repair time of two days, each machine has a 99.726% uptime.  The MTBF, that 
 is, the expected time until all three machines will fail simultaneously, is 
 well over 100,000 years!  Of course, this is silly, machines don't last that 
 long, but it illustrates the point:  Redundancy beats reliability (which 
 is why RAID is so useful). 
 
 All for $6,000.
 
 Craig
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Query planner issue

2006-01-30 Thread Mark Lewis
You have lots of dead rows.  Do a vacuum full to get it under control,
then run VACUUM more frequently and/or increase your FSM settings to
keep dead rows in check.  In 7.2 vacuum is pretty intrusive; it will be
much better behaved once you can upgrade to a more recent version.
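
For example (mandats is the table from the query below; note that VACUUM FULL
takes an exclusive lock, so schedule it for a quiet period):

  VACUUM FULL ANALYZE mandats;   -- one-time cleanup of the accumulated dead rows
  VACUUM ANALYZE mandats;        -- afterwards, run this much more frequently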

You really, really want to upgrade as soon as possible, and refer to the
on-line docs about what to do with your FSM settings.

-- Mark Lewis


On Mon, 2006-01-30 at 23:57 +0100, Emmanuel Lacour wrote:
 Hi everybody,
 
 I have the following problem, on a test server, if I do a fresh import
 of production data then run 
 'explain analyze select count(*) from mandats;'
 
 I get this result:
 
 Aggregate  (cost=6487.32..6487.32 rows=1 width=0) (actual time=607.61..607.61 
 rows=1 loops=1)
   ->  Seq Scan on mandats  (cost=0.00..6373.26 rows=45626 width=0) (actual 
 time=0.14..496.20 rows=45626 loops=1)
   Total runtime: 607.95 msec
 
 
 On the production server, if I do the same (without other use of the server), 
 I get:
 
 Aggregate  (cost=227554.33..227554.33 rows=1 width=0) (actual 
 time=230705.79..230705.79 rows=1 loops=1)
   ->  Seq Scan on mandats  (cost=0.00..227440.26 rows=45626 width=0) (actual 
 time=0.03..230616.64 rows=45760 loops=1)
   Total runtime: 230706.08 msec
 
 
 
 Does anyone have an idea on how to solve this poor performance? I
 think it is caused by many deletes/inserts on this table every day, but
 how to solve it? I need to run this query each hour :(. I run
 vacuum each night, postgresql is unfortunately 7.2.1 :( (no upgrade
 before 2 or 3 months).
 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Suspending SELECTs

2006-01-17 Thread Mark Lewis
 I am developing my applications in Objective Caml, and I have written the 
 middleware layer myself. I could easily implement a cursor-pooling strategy, 
 but 
 there is no perfect solution to the problem of guaranteeing that cursors be 
 closed. Remember that web applications require the user to open a session 
 by 
 connecting to the appropriate HTTP resource, but users are never required to log
 out. Hence, in order to eventually reclaim all cursors, I must use a magical
 log-out detection algorithm, which is usually implemented with a simple
 timeout. This guarantees the required property of safety (the population of 
 cursors does not diverge) but does not guarantee the required property of 
 liveness (a user connecting to the application, who has opened a session but 
 has 
 not logged out, and thus possesses a session token, should have access to the 
 execution context identified by his token).

I fail to see the problem here.  Why should liveness be a required
property?  If it is simply that you can't promptly detect when a user is
finished with their web session so you can free resources, then remember
that there is no requirement that you dedicate a connection to their
session in the first place.  Even if you're using your own custom
middleware, it isn't a very complicated or conceptually difficult thing
to implement (see my previous post).  Certainly it's simpler than
allowing clients to pass around runtime state.

As far as implementing this sort of thing in the back-end, it would be
really hard with the PostgreSQL versioning model.  Oracle can more
easily (and kind of does) support cursors like you've described because
they implement MVCC differently than PostgreSQL, and in their
implementation you're guaranteed that you always have access to the most
recent x megabytes of historical rows, so even without an open
transaction to keep the required rows around you can still be relatively
sure they'll be around for long enough.  In PostgreSQL, historical
rows are kept in the tables themselves and periodically vacuumed, so
there is no such guarantee, which means that you would need to either
implement a lot of complex locking for little material gain, or just
hold the cursors in moderately long-running transactions, which leads
back to the solution suggested earlier.

-- Mark Lewis



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Suspending SELECTs

2006-01-16 Thread Mark Lewis
On Mon, 2006-01-16 at 11:13 +0100, Alessandro Baretta wrote:
 I am aware that what I am dreaming of is already available through cursors, 
 but 
 in a web application, cursors are bad boys, and should be avoided. What I 
 would 
 like to be able to do is to plan a query and run the plan to retreive a 
 limited 
 number of rows as well as the executor's state. This way, the burden of 
 maintaining the cursor on hold, between activations of the web resource 
 which 
 uses it, is transferred from the DBMS to the web application server, and, 
 most 
 importantly, the responsibility for garbage-collecting stale cursors is 
 implicitely delegated to the garbage-collector of active user sessions. 
 Without 
 this mechanism, we are left with two equally unpleasant solutions: first, any 
 time a user instantiates a new session, a new cursor would have to be 
 declared 
 for all relevant queries, and an ad-hoc garbage collection daemon would have 
 to 
 be written to periodically scan the database for stale cursors to be closed; 
 otherwise, instead of using cursors, the web application could resort to 
 OFFSET-LIMIT queries--no garbage collection issues but pathetic performance 
 and 
 server-load.
 
 Do we have any way out?
 
 Alex

I know that Tom has pretty much ruled out any persistent cursor
implementation in the database, but here's an idea for a workaround in
the app:

Have a pool of connections used for these queries.  When a user runs a
query the first time, create a cursor and remember that this user
session is associated with that particular connection.  When the user
tries to view the next page of results, request that particular
connection from the pool and continue to use the cursor.  Between
requests, this connection could of course be used to service other
users.

This avoids the awfulness of tying up a connection for the entire course
of a user session, but still allows you to use cursors for
performance.  
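
A sketch of what would run on that pooled connection (the query and cursor name
are made up):

  BEGIN;
  DECLARE session_42_results CURSOR FOR
      SELECT * FROM documents ORDER BY created_at;

  FETCH 25 FROM session_42_results;   -- first page
  -- a later request from the same user session is routed back here:
  FETCH 25 FROM session_42_results;   -- next page

  -- when the session is invalidated or times out:
  CLOSE session_42_results;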

When a user session is invalidated or times out, you remove the mapping
for this connection and close the cursor.  Whenever there are no more
mappings for a particular connection, you can use the opportunity to
close the current transaction (to prevent eternal transactions).

If the site is at all busy, you will need to implement a pooling policy
such as 'do not open new cursors on the connection with the oldest
transaction', which will ensure that all transactions can be closed in a
finite amount of time, the upper bound on the duration of a transaction
is (longest_session_duration * connections in pool).

Limitations:

1. You shouldn't do anything that acquires write locks on the database
using these connections, because the transactions will be long-running.
To mitigate this, use a separate connection pool.

2. Doesn't work well if some queries take a long time to run, because
other users may need to wait for the connection, and another connection
won't do.

3. If this is a busy web site, you might end up with potentially many
thousands of open cursors.  I don't know if this introduces an
unacceptable performance penalty or other bottleneck in the server?

-- Mark Lewis

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Extremely irregular query performance

2006-01-11 Thread Mark Lewis
If this is a query that will be executed more than once, you can also
avoid incurring the planning overhead multiple times by using PREPARE.
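
A minimal sketch (the query itself is made up); the plan is built once at
PREPARE time and reused by every EXECUTE:

  PREPARE find_item(integer) AS
      SELECT * FROM items WHERE item_id = $1;

  EXECUTE find_item(42);
  EXECUTE find_item(43);   -- no further planning cost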

-- Mark Lewis

On Wed, 2006-01-11 at 18:50 -0500, Jean-Philippe Côté wrote:
 Thanks a lot for this info, I was indeed exceeding the genetic
 optimizer's threshold.  Now that it is turned off, I get
 a very stable response time of 435ms (more or less 5ms) for
 the same query. It is about three times slower than the best
 I got with the genetic optimizer on, but the overall average
 is much lower.
 
 I'll also try to play with the geqo parameters and see if things
 improve.
 
 Thanks again,
 
 J-P


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] help tuning queries on large database

2006-01-10 Thread Mark Lewis
Ron,

A few days back you mentioned:

 Upgrade your kernel to at least  2.6.12
 There's a known issue with earlier versions of the 2.6.x kernel and 
 64b CPUs like the Opteron.  See kernel.org for details.
 

I did some searching and couldn't find any obvious mention of this issue
(I gave up after searching through the first few hundred instances of
64 in the 2.6.12 changelog).

Would you mind being a little more specific about which issue you're
talking about?  We're about to deploy some new 16GB RAM Opteron DB
servers and I'd like to check and make sure RH backported whatever the
fix was to their current RHEL4 kernel.

Thanks,
Mark Lewis



Re: [PERFORM] browsing table with 2 million records

2005-10-26 Thread Mark Lewis
Do you have an index on the date column?  Can you post an EXPLAIN
ANALYZE for the slow query?
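
As a starting point, something along these lines (assuming a hypothetical
table named records with a date column):

    CREATE INDEX records_date_idx ON records (date);
    ANALYZE records;

    -- then post the output of:
    EXPLAIN ANALYZE
        SELECT * FROM records ORDER BY date LIMIT 25 OFFSET 0;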

-- Mark Lewis

On Wed, 2005-10-26 at 13:41 -0700, aurora wrote:
 I am running PostgreSQL 7.4 on FreeBSD. The main table has 2 million
 records (we would like to do at least 10 million or more). It is mainly a
 FIFO structure with maybe 200,000 new records coming in each day that
 displace the older records.
 
 We have a GUI that lets users browse through the records page by page at
 about 25 records a time. (Don't ask me why, but we have to have this
 GUI). This translates to something like
 
   select count(*) from table   -- to give feedback about the DB size
   select * from table order by date limit 25 offset 0
 
 The table seems properly indexed, with vacuum and analyze run regularly.
 Still, these very basic SQL statements take up to a minute to run.
 
 I read some recent messages that select count(*) would need a table
 scan for PostgreSQL. That's disappointing, but I can accept an
 approximation if there is some way to do so. But how can I optimize
 select * from table order by date limit x offset y? One minute
 response time is not acceptable.
 
 Any help would be appreciated.
 
 Wy
 
 




Re: [PERFORM] Is There Any Way ....

2005-10-04 Thread Mark Lewis
Which version of PG are you using?  One of the new features for 8.0 was
an improved caching algorithm that was smart enough to avoid letting a
single big query sweep everything else out of cache.

-- Mark Lewis


On Tue, 2005-10-04 at 10:45 -0400, Lane Van Ingen wrote:
 Yes, Stefan, the kind of usage you are mentioning is exactly why I was
 asking.
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Stefan Weiss
 Sent: Tuesday, October 04, 2005 6:32 AM
 To: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Is There Any Way 
 
 
 On 2005-09-30 01:21, Lane Van Ingen wrote:
  (3) Assure that a disk-based table is always in memory (outside of
      keeping it in memory buffers as a result of frequent activity which
      would prevent LRU operations from taking it out)?
 
 I was wondering about this too. IMO it would be useful to have a way to tell
 PG that some tables were needed frequently, and should be cached if
 possible. This would allow application developers to consider joins with
 these tables as cheap, even when querying on columns that are not indexed.
 I'm thinking about smallish tables like users, groups, *types, etc which
 would be needed every 2-3 queries, but might be swept out of RAM by one
 large query in between. Keeping a table like users on a RAM fs would not
 be an option, because the information is not volatile.
 
 
 cheers,
 stefan
 


Re: [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Mark Lewis
operations != passes.  If you were clever, you could probably write a
modified bubble-sort algorithm that only made 2 passes.  A pass is a
disk scan, operations are then performed (hopefully in memory) on what
you read from the disk.  So there's no theoretical log N lower-bound on
the number of disk passes.

Not that I have anything else useful to add to this discussion, just a
tidbit I remembered from my CS classes back in college :)

-- Mark

On Fri, 2005-09-23 at 13:17 -0400, Tom Lane wrote:
 Ron Peacetree [EMAIL PROTECTED] writes:
  2= No optimal external sorting algorithm should use more than 2 passes.
  3= Optimal external sorting algorithms should use 1 pass if at all possible.
 
 A comparison-based sort must use at least N log N operations, so it
 would appear to me that if you haven't got approximately log N passes
 then your algorithm doesn't work.
 
   regards, tom lane
 


Re: [PERFORM] Poor performance of delete by primary key

2005-09-06 Thread Mark Lewis
I had a similar problem, so I downloaded 8.1 from CVS, ran it on a
relatively gnarly dev workstation, imported a dump of my 8.0 database,
and ran my troublesome queries with the new EXPLAIN ANALYZE.

This process took about an hour and worked great, provided that you've
actually named your foreign key constraints.  Otherwise, you'll find out
that there's a trigger for a constraint called $3 that's taking up all
of your time, but you won't know what table that constraint is on.
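
For anyone hitting the same thing, a minimal sketch of naming the foreign key
up front so it shows up legibly in the trigger timings (table and column names
are hypothetical):

    ALTER TABLE order_items
        ADD CONSTRAINT order_items_order_id_fkey
        FOREIGN KEY (order_id) REFERENCES orders (id);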

-- Mark



On Tue, 2005-09-06 at 11:32 -0400, Tom Lane wrote:
 Brian Choate [EMAIL PROTECTED] writes:
  We are seeing a very strange behavior from postgres. For one of our very
  common tasks we have to delete records from a table of around 500,000
  rows. The delete is by id which is the primary key. It seems to be
  consistently taking around 10 minutes to perform. This is totally out of
  line with the rest of the performance of the database.
 
 I'll bet this table has foreign-key references from elsewhere, and the
 referencing columns are either not indexed, or not of the same datatype
 as the master column.
 
 Unfortunately there's no very simple way to determine which FK is the
 problem.  (In 8.1 it'll be possible to do that with EXPLAIN ANALYZE,
 but in existing releases EXPLAIN doesn't break out the time spent in
 each trigger ...)  You have to just eyeball the schema :-(.
 
   regards, tom lane
 


Re: [PERFORM] PG8 Tuning

2005-08-11 Thread Mark Lewis
(Musing, trying to think of a general-purpose performance-tuning rule
that applies here):

Actually, it seems to me that with the addition of the WAL in PostgreSQL
and the subsequent decreased need to fsync the data files themselves
(only during checkpoints?), that the only time a battery-backed write
cache would make a really large performance difference would be on the
drive(s) hosting the WAL.

So although it is in general good to have a dedicated spindle for the
WAL, for many workloads it is in fact significantly better to have the
WAL written to a battery-backed write cache.  The exception would be for
applications with fewer, larger transactions, in which case you could
actually use the dedicated spindle.

Hmmm, on second thought, now I think I understand the rationale behind
having a non-zero commit delay setting-- the problem with putting
pg_xlog on a single disk without a write cache is that frequent fsync()
calls might cause it to spend most of its time seeking instead of
writing (as seems to be happening to Paul here).  Then again, the OS IO
scheduler should take care of this for you, making this a non-issue.
Perhaps Solaris 10 just has really poor IO scheduling performance with
this particular hardware and workload?

Ah well.  Thought myself in circles and have no real conclusions to show
for it.  Posting anyway, maybe this will give somebody some ideas to
work with.
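
For anyone who wants to experiment with this, the knobs in question are
ordinary configuration settings; a hedged sketch (the values are illustrative
only, not recommendations):

    -- check the current values from psql:
    SHOW commit_delay;      -- microseconds to wait before flushing WAL (default 0)
    SHOW commit_siblings;   -- concurrent transactions required before delaying

    -- then, in postgresql.conf:
    --   commit_delay = 10
    --   commit_siblings = 5
    --   wal_sync_method = fdatasync   -- worth benchmarking per platform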

-- Mark Lewis

On Fri, 2005-08-12 at 08:47 +, Steve Poe wrote:
 Paul,
 
 Before I say anything else, one online document which may be of
 assistance to you is:
 http://www.powerpostgresql.com/PerfList/
 
 Some thoughts I have:
 
 3) Your shared RAM setting seems overkill to me. Part of the challenge
 is you're going from 1000 to 262K with no assessment in between. Each
 situation can be different, but try in the range of 10 - 50K.
 
 4) pg_xlog: If your pg_xlog is on a spindle that is *only* for pg_xlog,
 you're better off. If it is sharing with any other OS/DB resource, the
 performance will be impacted.
 
 From what I have learned from others on this list, RAID5 is not the best
 choice for the database. RAID10 would be a better solution (using 8 of
 your disks); then take the remaining disk and mirror your pg_xlog onto it
 if possible.
 
 Best of luck,
 
 Steve Poe
 
 On Thu, 2005-08-11 at 13:23 +0100, Paul Johnson wrote:
  Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC
  CPUs running Solaris 10. The DB cluster is on an external fibre-attached
  Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN.
  
  The system is for the sole use of a couple of data warehouse developers,
  hence we are keen to use 'aggressive' tuning options to maximise
  performance.
  
  So far we have made the following changes and measured the impact on our
  test suite:
  
  1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement
  in some cases.
  
  2) Increase work_mem from 1,024 to 524,288.
  
  3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required
  setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box.
  
  Question - can Postgres only use 2GB RAM, given that shared_buffers can
  only be set as high as 262,143 (8K pages)?
  
  So far so good...
  
  4) Move /pg_xlog to an internal disk within the V250. This has had a
  severe *negative* impact on performance. The copy job has gone from 2 mins
  to 12 mins, and a simple SQL job has gone from 1 min to 7 mins. We have not
  even run the long SQL jobs yet.
  
  I'm guessing that this is because pg_xlog has gone from a 9 spindle LUN to
  a single spindle disk?
  
  In cases such as this, where an external storage array with a hardware
  RAID controller is used, the normal advice to separate the data from the
  pg_xlog  seems to come unstuck, or are we missing something?
  
  Cheers,
  
  Paul Johnson.
  


Re: [PERFORM] faster INSERT with possible pre-existing row?

2005-07-26 Thread Mark Lewis
Easier and faster than doing the custom trigger is to simply define a
unique index and let the DB enforce the constraint with an index lookup,
something like:

create unique index happy_index ON happy_table(col1, col2, col3);

That should run faster than the custom trigger, but not as fast as the
temp table solution suggested elsewhere because it will need to do an
index lookup for each row.  With this solution, it is important that
your shared_buffers are set high enough that the happy_index can be kept
in memory, otherwise performance will drop precipitously.  Also, if you
are increasing the size of the table by a large percentage, you will
want to ANALYZE periodically, as an optimal plan for a small table may
be a disaster for a large table, and PostgreSQL won't switch plans
unless you run ANALYZE.
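
For illustration, a hedged sketch of how the insert side can look once the
unique index is in place (values and names are placeholders):

    -- a plain INSERT now raises a unique_violation error on duplicates,
    -- which the application can catch and ignore:
    INSERT INTO happy_table (col1, col2, col3) VALUES (1, 2, 3);

    -- or fold the existence check into the statement itself; the unique
    -- index still acts as the backstop under concurrent inserts:
    INSERT INTO happy_table (col1, col2, col3)
    SELECT 1, 2, 3
    WHERE NOT EXISTS (SELECT 1 FROM happy_table
                      WHERE col1 = 1 AND col2 = 2 AND col3 = 3);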

-- Mark

On Tue, 2005-07-26 at 14:51 -0500, John A Meinel wrote:
 Matthew Nuzum wrote:
  On 7/26/05, Dan Harris [EMAIL PROTECTED] wrote:
  
 I am working on a process that will be inserting tens of millions of rows
 and need this to be as quick as possible.
 
 The catch is that for each row I could potentially insert, I need to
 look and see if the relationship is already there  to prevent
 multiple entries.  Currently I am doing a SELECT before doing the
 INSERT, but I recognize the speed penalty in doing two operations.  I
 wonder if there is some way I can say insert this record, only if it
 doesn't exist already.  To see if it exists, I would need to compare
 3 fields instead of just enforcing a primary key.
  
  
  I struggled with this for a while. At first I tried stored procedures
  and triggers, but it took very long (over 24 hours for my dataset).
  After several iterations of rewriting it, first into C# and then into
  Python I got the whole process down to under 30 min.
  
  My scenario is this:
  I want to normalize log data. For example, for the IP address in a log
  entry, I need to look up the unique id of the IP address, or if the IP
  address is new, insert it and then return the newly created entry.
  Multiple processes use the data, but only one process, run daily,
  actually changes it. Because this one process knows that the data is
  static, it selects the tables into in-memory hash tables (C#) or
  Dictionaries (Python) and then does the lookups there. It is *super*
  fast, but it uses a *lot* of ram. ;-)
  
  To limit the ram, I wrote a version of the python code that uses gdbm
  files instead of Dictionaries. This requires a newer version of Python
  (to allow a gdbm db to work just like a dictionary) but makes life
  easier in case someone is using my software on a lower end machine.
  This doubled the time of the lookups from about 15 minutes to 30,
  bringing the whole process to about 45 minutes.
  
 
 Did you ever try the temp table approach? You could:
 
 COPY all records into temp_table, with an empty row for ip_id
 -- Get any entries which already exist
 UPDATE temp_table SET ip_id =
   (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add)
   WHERE EXISTS (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add);
 -- Create new entries
 INSERT INTO ipaddress(add) SELECT add FROM temp_table
WHERE ip_id IS NULL;
 -- Update the rest
 UPDATE temp_table SET ip_id =
   (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add)
   WHERE ip_id IS NULL AND
   EXISTS (SELECT ip_id FROM ipaddress WHERE ipaddress.add = temp_table.add);
 
 This would let the database do all of the updating work in bulk on its
 side, rather than you pulling all the data out and doing it locally.
 
 An alternative would be something like:
 
 CREATE TEMP TABLE new_ids (address text, ip_id int);
 COPY all potentially new addresses into that table.
 -- Delete all entries which already exist
 DELETE FROM new_ids WHERE EXISTS
   (SELECT ip_id FROM ipaddresses
 WHERE add=new_ids.address);
 -- Now create the new entries
 INSERT INTO ipaddresses(add) SELECT address FROM new_ids;
 
 -- At this point you are guaranteed to have all addresses existing in
 -- the database
 
 If you then insert your full data into the final table, only leave the 
 ip_id column as null. Then if you have a partial index where ip_id is 
 NULL, you could use the command:
 
 UPDATE final_table SET ip_id =
   (SELECT ip_id FROM ipaddresses WHERE add=final_table.add)
 WHERE ip_id IS NULL;
 
 You could also do this in a temporary table, before bulk inserting into 
 the final table.
 
 I don't know what you have tried, but I know that for Dan, he easily has 
   36M rows. So I don't think he wants to pull that locally and create an
 in-memory hash just to insert 100 rows or so.
 
 Also, for your situation, if you do keep a local cache, you could 
 certainly save the cache between runs, and use a temp table to determine 
 what new ids you need to add to it. Then you wouldn't have to pull the 
 complete set each time. You just pull new values for entries you haven't 
 added yet.
 
 John
 =:-



Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-07-08 Thread Mark Lewis
On Fri, 2005-07-08 at 16:43 +0200, Enrico Weigelt wrote:
 * PFC [EMAIL PROTECTED] wrote:
 
 snip
  For Python it's the reverse: the MySQL driver is slow and dumb, and the
  postgres driver (psycopg 2) is super fast, handles all quoting, and knows
  about type conversions; it will automatically convert a Python list into a
  postgres array and do the right thing with quoting, and it works both ways
  (i.e. you select a TEXT[] and you get a list of strings, all parsed for
  you). It knows about all the postgres types (yes, even numeric = python
  Decimal) and you can even add your own types. That's really cool, plus the
  developer is a friendly guy.
 
 Is there anything similar for Java?
 

The postgres JDBC driver is very good -- refer to the pgsql-jdbc mailing list
or look at jdbc.postgresql.org.  I've had only limited experience with
the mysql jdbc driver, but it seemed serviceable enough, if you can live
with their licensing and feature set.






Re: [PERFORM] Needed: Simplified guide to optimal memory

2005-06-16 Thread Mark Lewis
We run the RPMs for RH 7.3 on our 7.2 install base with no problems.
RPMs as recent as PostgreSQL 7.4.2 are available here:
ftp://ftp10.us.postgresql.org/pub/postgresql/binary/v7.4.2/redhat/redhat-7.3/

Or you can always compile from source.  There isn't any such thing as a
'supported' package for RH7.2 anyway.

-- Mark Lewis


On Thu, 2005-06-16 at 07:46 -0700, Todd Landfried wrote:
 Yes, it is 7.2. Why? Because an older version of our software runs on
 RH7.3, and that was the latest supported release of PostgreSQL for
 RH7.3 (that we can find). We're currently ported to 8, but we still
 have a large installed base running the other version.
 
 
 On Jun 15, 2005, at 7:18 AM, Tom Lane wrote:
 
  Dennis Bjorklund [EMAIL PROTECTED] writes:
 
  On Wed, 15 Jun 2005, Todd Landfried wrote:
 
  NOTICE:  shared_buffers is 256
 
 
 
  This looks like it's way too low. Try something like 2048.
 
 
  It also is evidently PG 7.2 or before; SHOW's output hasn't looked like
  that in years.  Try a more recent release --- there are usually nontrivial
  performance improvements in each major release.
 
  regards, tom lane
 


Re: [PERFORM] [GENERAL] Hash index vs. b-tree index (PostgreSQL

2005-05-10 Thread Mark Lewis
If the original paper was published in 1984, then it's been more than 20
years.  Any potential patents would already have expired, no?

-- Mark Lewis

On Tue, 2005-05-10 at 14:35, Mischa Sandberg wrote:
 Quoting Jim C. Nasby [EMAIL PROTECTED]:
 
  Well, in a hash-join right now you normally end up feeding at least one
  side of the join with a seqscan.  Wouldn't it speed things up
  considerably if you could look up hashes in the hash index instead?
 
 You might want to google on grace hash and hybrid hash.
 
 The PG hash join is the simplest possible: build a hash table in memory,
 and match an input stream against it.
 
 *Hybrid hash* is where you spill the hash to disk in a well-designed
 way. Instead of thinking of it as building a hash table in memory, think
 of it as partitioning one input; if some or all of it fits in memory,
 all the better. The boundary condition is the same. 
 
 The real wizard of hybrid hash has to be Goetz Graefe, who sadly has now
 joined the MS Borg. He demonstrated that for entire-table joins, hybrid
 hash completely dominates sort-merge. MSSQL now uses what he developed
 as an academic, but I don't know what the patent state is.
 
 Grace hash is the original implementation of hybrid hash:
   Kitsuregawa, M., Tanaka, H., and Moto-oka, T. (1984).
   Architecture and Performance of Relational Algebra Machine Grace. 
 
 
 


Re: [PERFORM] [NOVICE] Many connections lingering

2005-04-13 Thread Mark Lewis
If there are potentially hundreds of clients at a time, then you may be
running into the maximum connection limit.

In postgresql.conf, there is a max_connections setting which IIRC
defaults to 100.  If you try to open more concurrent connections to the
backend than that, you will get a connection refused.

If your DB is fairly gnarly and your performance needs are minimal it
should be safe to increase max_connections.  An alternative approach
would be to add some kind of database broker program.  Instead of each
agent connecting directly to the database, they could pass their data to
a broker, which could then implement connection pooling.
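
For reference, a quick way to check and raise the limit (the number below is
just an example):

    -- from psql:
    SHOW max_connections;

    -- then in postgresql.conf (a server restart is required):
    --   max_connections = 300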

-- Mark Lewis

On Tue, 2005-04-12 at 22:09, Slavisa Garic wrote:
 This is a serious problem for me as there are multiple users using our
 software on our server and I would want to avoid having connections
 open for a long time. In the scenario mentioned below I haven't
 explained the magnitude of the communications happening between Agents
 and DBServer. There could possibly be 100 or more Agents per
 experiment, per user running on remote machines at the same time,
 hence we need short transactions/pgsql connections. Agents need a
 reliable connection because failure to connect could mean a loss of
 computation results that were gathered over long periods of time.





Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-28 Thread Mark Lewis
Tom Lane Wrote:
 Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
 the runtime expended in each trigger when the statement is of a kind
 that has triggers.  We couldn't break down the time *within* the
 triggers, but even this info would help a lot in terms of finger
 pointing ...
 
   Seq Scan on ... (nn.nnn ms)
   Trigger foo: nn.mmm ms
   Trigger bar: nn.mmm ms
   Total time: nn.mmm ms


So I got the latest from CVS on Friday night to see how hard it would be
to implement this, but it turns out that Tom has already committed the
improvement, so I'm in Tom's fan club today.  I imported my test dataset
and was almost immediately able to track down the cause of my
performance problem.

Thanks!
Mark Lewis




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-24 Thread Mark Lewis
Tom,

I've got a similar problem with deletes taking a very long time.  I know
that there are lots of foreign keys referencing this table, and other
foreign keys referencing those tables, etc.  I've been curious, is there
a way to find out how long the foreign key checks take for each
dependent table?

-- Mark Lewis

On Thu, 2005-03-24 at 16:52, Tom Lane wrote:
 Karim Nassar [EMAIL PROTECTED] writes:
  Here is the statement:
 
  orfs=# explain analyze DELETE FROM int_sensor_meas_type WHERE
  id_meas_type IN (SELECT * FROM meas_type_ids);
   QUERY PLAN 
  -
   Hash Join  (cost=11.53..42.06 rows=200 width=6) (actual
  time=1.564..2.840 rows=552 loops=1)
  ...
   Total runtime: 2499616.216 ms
  (7 rows)
 
 Notice that the actual join is taking 2.8 ms.  The other ~40 minutes is
 in operations that we cannot see in this plan, but we can surmise are ON
 DELETE triggers.
 
  Where do I go from here?
 
 Look at what your triggers are doing.  My private bet is that you have
 unindexed foreign keys referencing this table, and so each deletion
 forces a seqscan of some other, evidently very large, table(s).
 
   regards, tom lane
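
As a concrete follow-up to Tom's diagnosis: the usual fix is to index the
referencing column in each dependent table (and make sure its datatype
matches the referenced key). A hedged sketch with hypothetical names:

    -- for each table whose foreign key points at int_sensor_meas_type:
    CREATE INDEX child_table_id_meas_type_idx
        ON child_table (id_meas_type);

    -- then re-run the DELETE under EXPLAIN ANALYZE (8.1+) to confirm the
    -- per-trigger times have dropped.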
 