Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-28 Thread Steinar H. Gunderson
On Sat, Jun 28, 2008 at 06:24:42PM +0200, Moritz Onken wrote:
  SELECT distinct url from item where url like 'http://www.micro%' limit 10;

Here, the planner knows the pattern beforehand, and can see that it's a
simple prefix.
 select *
  from result
  where exists
(select * from item where item.url LIKE result.url || '%' limit 1)  
 limit 10;

Here it cannot (what if result.url was '%foo%'?).

Try using something like (item.url >= result.url AND item.url <= result.url ||
'z'), substituting an appropriately high character for 'z'.
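Something like this, as a sketch (same item/result tables as in the quoted
query, and assuming the URLs only contain characters below 'z'):

  select *
    from result
   where exists
         (select 1 from item
           where item.url >= result.url
             and item.url <= result.url || 'z')
   limit 10;

That gives the planner two plain range conditions it can match against a
B-tree index on item.url, instead of a LIKE pattern that is unknown at
planning time.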

 The only explanation is that I don't use a constant when comparing the
 values. But actually it is a constant...

It's not a constant at planning time.

Also note that you'd usually want to use IN instead of a WHERE EXISTS.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Cursors and different settings for default_statistics_target

2008-04-01 Thread Steinar H. Gunderson
On Tue, Apr 01, 2008 at 12:42:03PM -0400, Tom Lane wrote:
 That's CURSOR_OPT_FAST_PLAN, isn't it? Our application reads the full
 results of most cursors.
 Just out of curiosity, why use a cursor at all then?

This isn't the same scenario as the OP, but I've used a cursor in cases where
I cannot keep all of the dataset in memory at the client at once, but I _can_
coerce it down to a more manageable size as it comes in.

I don't know if a cursor is the only way to do this (short of making a custom
function inside Postgres of some sort), but it seems to be the simplest way
in libpqxx, at least.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] migration of 7.4 to 8.1

2008-03-12 Thread Steinar H. Gunderson
On Wed, Mar 12, 2008 at 12:13:01PM +0530, sathiya psql wrote:
 My question is that how to migrate my database to 7.4 to 8.1

  aptitude install postgresql-8.1
  pg_dropcluster 8.1 main
  pg_upgradecluster 7.4 main

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] postgresql performance

2008-03-05 Thread Steinar H. Gunderson
On Wed, Mar 05, 2008 at 02:27:08AM -0800, SPMLINGAM wrote:
  I have a table with 50 lakhs records, the table has more then 10
 fields, i have primary key, i have select query with count(*) without any
 condition, it takes 17 seconds.

Without knowing what a lakhs record is, it's pretty obvious that you
haven't vacuumed in a very long time. Run VACUUM FULL on your tables, then
instate regular (non-FULL) VACUUMs or enable autovacuum.
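For example (a sketch only; the table name is made up, and autovacuum is
configured in postgresql.conf):

  VACUUM FULL VERBOSE mytable;  -- one-off: reclaims the accumulated bloat, but takes an exclusive lock
  VACUUM ANALYZE mytable;       -- cheap enough to run regularly; also keeps planner statistics fresh

After the one-time VACUUM FULL, regular plain VACUUMs (or autovacuum) keep
the table from bloating again.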

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Optimisation help

2008-03-04 Thread Steinar H. Gunderson
On Wed, Mar 05, 2008 at 12:15:25AM +, dforums wrote:
 In regards of update, I have around 1 updates while a laps of 10 minutes

 Is there a settings to optimise updates ?

If you can, batch them into a single transaction.
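A minimal sketch of the batching idea (table and values are made up):

  BEGIN;
  UPDATE hits SET counter = counter + 1 WHERE page_id = 1;
  UPDATE hits SET counter = counter + 1 WHERE page_id = 2;
  -- ... the rest of the batch ...
  COMMIT;

With everything in one transaction there is only one WAL flush at COMMIT
instead of one per statement, which is usually the dominant cost for many
small updates.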

If you can, upgrade to 8.3. HOT might help you here.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Making the most of memory?

2008-01-24 Thread Steinar H. Gunderson
On Wed, Jan 23, 2008 at 07:54:24PM -0500, Greg Smith wrote:
 (a standard mechanical drive under heavy write load also wears out faster
 than one doing less work).

Wasn't this one of the myths that was dispelled in the Google disk paper a
while ago?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] [OT] RAID controllers blocking one another?

2008-01-17 Thread Steinar H. Gunderson
On Thu, Jan 17, 2008 at 03:07:02PM -0600, Scott Marlowe wrote:
 Sounds like they're sharing something they shouldn't be.  I'm not real
 familiar with PCI-express.  Aren't those the ones that use up to 16
 channels for I/O?  Can you divide it to 8 and 8 for each PCI-express
 slot in the BIOS maybe, or something like that?

PCI-E is a point-to-point system, so the slots don't share a bus whose
bandwidth you could divide between them.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] SELECT * FROM table is too slow

2007-12-16 Thread Steinar H. Gunderson
On Sun, Dec 16, 2007 at 07:34:45PM +0100, Adam PAPAI wrote:
 Why does it take cca 18-20 sec to get the results?
 Too many indexes?

You cannot possibly have VACUUMed in a long time. Try a VACUUM FULL, and then
schedule regular VACUUMs (or use autovacuum).

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Limited performance on multi core server

2007-12-12 Thread Steinar H. Gunderson
On Wed, Dec 12, 2007 at 10:16:43AM +, Matthew Lunnon wrote:
 Does anyone have any ideas what my bottle neck might be and what I can do 
 about it?

Your bottleneck is that you are using a very old version of PostgreSQL. Try
8.2 or (if you can) the 8.3 beta series -- it scales a _lot_ better in this
kind of situation.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] GiST indexing tuples

2007-11-29 Thread Steinar H. Gunderson
On Thu, Nov 29, 2007 at 03:23:10PM -0500, Matthew T. O'Connor wrote:
 Sorry in advance if this is a stupid question, but how is this better than 
 two index, one on a and one on b?  I supposed there could be a space 
 savings but beyond that?

You could index on both columns simultaneously without a bitmap index scan.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Steinar H. Gunderson
On Wed, Nov 28, 2007 at 09:16:08PM +0100, cluster wrote:
 Hmm, actually I still don't understand why it takes 6400 ms to fetch the 
 rows. As far as I can see the index used is covering so that real row 
 lookups shouldn't be necessary.

The indexes don't contain visibility information, so Postgres has to look up
the row on disk to verify it isn't dead.

 Also, only the random_numbers induced by questions with status = 1
 should be considered - and this part is a relatively small subset.

Again, you'll need to have a combined index if you want this to help you any.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] GiST indexing tuples

2007-11-27 Thread Steinar H. Gunderson
On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote:
 SELECT * FROM table WHERE a > 1 AND b < 4;

This sounds like something an R-tree can do.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Steinar H. Gunderson
On Tue, Nov 27, 2007 at 11:51:40PM +0100, cluster wrote:
 A thing that strikes me is the following. As you can see I have the 
 constraint: q.status = 1. Only a small subset of the data set has this 
 status. I have an index on q.status but for some reason this is not used. 
 Instead the constraint are ensured with a Filter: (q.status = 1) in an 
 index scan for the primary key in the q table. If the small subset having 
 q.status = 1 could be isolated quickly using an index, I would expect the 
 query to perform better. I just don't know why the planner doesn't use the 
 index on q.status.

An index scan (as opposed to a bitmap index scan) can only use one index at a
time, so it will choose the most selective one. Here it quite correctly
recognizes that there will only be one matching record for the given
question_id, so it uses the primary key instead.

You could make an index on (question_id,status) (or a partial index on
question id, with status=1 as the filter), but I'm not sure how much it would
help you unless the questions table is extremely big. It doesn't appear to
be; in fact, it appears to be all in RAM, so that's not your bottleneck.
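For reference, the two variants would look something like this (table and
column names are taken from the quoted query, so the real schema may differ
slightly):

  CREATE INDEX questions_qid_status_idx ON questions (question_id, status);
  -- or the partial variant:
  CREATE INDEX questions_qid_status1_idx ON questions (question_id) WHERE status = 1;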

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Steinar H. Gunderson
On Tue, Nov 27, 2007 at 07:25:54PM -0500, Tom Lane wrote:
 You could make an index on (question_id,status) (or a partial index on
 question id, with status=1 as the filter), but I'm not sure how much it would
 help you unless the questions table is extremely big. It doesn't appear to
 be; in fact, it appears to be all in RAM, so that's not your bottleneck.
 Wouldn't help, because the accesses to questions are not the problem.

Yes, that was my point too. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD

2007-11-11 Thread Steinar H. Gunderson
On Sun, Nov 11, 2007 at 08:27:02PM +0100, Dimitri wrote:
 As well to demonstrate OS parallelism it's not so impressive to see
 4CPU server results rather 8CPU or 32threaded Niagara... Don't know
 why they did not present similar performance graphs for these
 platform, strange no?...

I guess it's because their Niagara support is still very raw, and besides,
it's not a very common platform.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Join performance

2007-11-08 Thread Steinar H. Gunderson
On Thu, Nov 08, 2007 at 04:47:09PM -0600, Pepe Barbe wrote:
 I am having an issue on PostgreSQL 8.0.12. In the past we had performance 
 issues with the query planner for queries on some tables where we knew we 
 had indexes and it was doing a sequential scan, and for this reason we 
 issue SET enable_seqscan = FALSE for some queries.

This is a bad idea in general. Did you really measure that this made queries
run faster? Generally, using an index is not always a win, and the planner
tries to figure out when it isn't. Setting it globally is seldom a good idea
anyway; if it really _is_ a win for a given query, you could always set it
locally in that session.
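If you do keep the override for one specific query, a per-transaction setting
is much safer than a global one; a sketch (the query itself is just a
placeholder):

  BEGIN;
  SET LOCAL enable_seqscan = off;                   -- affects only this transaction
  SELECT * FROM orders WHERE customer_id = 42;      -- the query that really needs the index
  COMMIT;

SET LOCAL reverts automatically at COMMIT or ROLLBACK, so the rest of the
application still gets normal planning.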

 Any ideas what could I try to fix this problem?

Re-enable seqscan?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Migrating to 8.3 - checkpoints and background writer

2007-11-04 Thread Steinar H. Gunderson
On Sun, Nov 04, 2007 at 07:33:46PM -0500, Greg Smith wrote:
 On the topic of performance improvements in 8.3, I don't think this list 
 has been getting information about the concurrent sequential scans 
 improvements.  Check out these documents for more about that:

 http://j-davis.com/postgresql/83v82_scans.html

That's a nice writeup. I'm a bit puzzled by this part, though: "All tests
are on linux with the anticipatory I/O scheduler. The default I/O scheduler
for Linux is CFQ (Completely Fair Queue), which does not work well for
PostgreSQL in my tests."

All earlier benchmarks I've seen (even specifically for Postgres) have said
that cfq > deadline > anticipatory for database work. How large was the
difference for you?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] two queryes in a single tablescan

2007-10-17 Thread Steinar H. Gunderson
On Wed, Oct 17, 2007 at 02:30:52PM +0200, Stefano Dal Pra wrote:
 The main goal would be to get multiple results while scanning the
 table[s] once only
 thus getting results in a faster  way.

In 8.3, Postgres will do this for you itself -- if you already have a
sequential scan running against a given table, another one starting in
parallel will simply piggyback it.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] REPOST: Nested loops row estimates always too high

2007-09-25 Thread Steinar H. Gunderson
On Tue, Sep 25, 2007 at 12:53:55AM -0400, Carlo Stonebanks wrote:
 My problem is that I think that SRF's are causing my problems. The SRF's
 gets an automatic row estimate of 1000 rows. Add a condition to it, the
 planner guesses 333 rows. Even at 333, this is an overestimate of the number
 of rows returned.
 
 I'm really disappointed - SRF's are a great way to place the enterprise's
 db-centric business logic at the server.

Fortunately, in 8.3 you can attach a row estimate to the function yourself,
which should most likely fix your problem. Look forward to the first beta :-)
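A sketch of what that looks like in 8.3 (the function itself is made up):

  CREATE FUNCTION active_customers() RETURNS SETOF customers AS $$
      SELECT * FROM customers WHERE active;
  $$ LANGUAGE sql
  ROWS 50;  -- planner now assumes ~50 rows instead of the default 1000

There is a matching COST attribute for functions that are expensive per call.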

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] DRBD and Postgres: how to improve the perfomance?

2007-09-11 Thread Steinar H. Gunderson
On Tue, Sep 11, 2007 at 04:47:40PM +0200, Maila Fatticcioni wrote:
 The servers are cross connected with a common 100 Mbit/sec Ethernet so I
 think they have a bandwidth around 80 Mbit/sec (even if I haven't yet
 done any test on it). A rate of 70Mb seems reasonable to me.

Umm, seriously? Unless that was a typo, you should very seriously consider
going to gigabit; it's cheap these days, and should provide a very decent
speed boost if network bandwidth is the bottleneck.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Transaction Log

2007-08-29 Thread Steinar H. Gunderson
On Wed, Aug 29, 2007 at 01:11:32PM -0700, Steve Atkins wrote:
 I think the general conclusion was "When they come out with an ECC
 version, we'll look at them."

FWIW, it shouldn't be impossible to implement ECC in software; they'd still
be orders of magnitude faster than normal disks.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Fast tsearch2, trigram matching on short phrases

2007-08-22 Thread Steinar H. Gunderson
On Wed, Aug 22, 2007 at 12:02:54PM -0400, Carlo Stonebanks wrote:
 Any suggestions on where to go with this project to improve performance 
 would be greatly appreciated.

I'm a bit unsure from reading your mail -- have you tried pg_trgm with a GiST
index?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Poor Performance after Upgrade

2007-08-21 Thread Steinar H. Gunderson
On Mon, Aug 20, 2007 at 10:17:14PM -0700, Ben Perrault wrote:
   ->  Nested Loop  (cost=1.00..106559138.00 rows=336 width=137)

This sounds very much like you're trying to force the planner. Did you set
enable_nestloop=false or something? Are there any other non-default settings
that could negatively impact planner performance?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] schema design question

2007-08-19 Thread Steinar H. Gunderson
On Sun, Aug 19, 2007 at 11:12:16AM -0700, David Fetter wrote:
 There's your mistake.  EAV is not performant, and won't become so.

It sort of depends. I put all the EXIF information for my image gallery into
an EAV table -- it was the most logical format at the time, although I'm not
sure I need all the information. Anyhow, with clustering and indexes,
Postgres zips through the five million records easily enough for my use -- at
least fast enough that I can live with it without feeling the need for a
redesign.

As a general database design paradigm, though, I fully agree with you.
Databases are databases, not glorified OO data stores or hash tables.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] schema design question

2007-08-19 Thread Steinar H. Gunderson
On Sun, Aug 19, 2007 at 11:41:15AM -0700, David Fetter wrote:
 Unless your records are huge, that's a tiny database, where tiny is
 defined to mean that the whole thing fits in main memory with plenty
 of room to spare.  I guarantee that performance will crash right
 through the floor as soon as any table no longer fits in main memory.

Sure, it fits into memory; it isn't used all that often, though, so it's
frequently not in the cache when it's needed. You are completely right in
that it's much slower from disk than from RAM :-)

The question is, of course, how to best store something like the EXIF
information _without_ using EAV. I could separate out the few fields I
normally use into a horizontal (ie. standard relational) table, but it seems
sort of... lossy? Another possible approach is to keep the EAV table around
for completeness in addition to the few fields I need, but then you do of
course get into normalization issues.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Integrated perc 5/i

2007-08-16 Thread Steinar H. Gunderson
On Thu, Aug 16, 2007 at 11:26:52AM +0300, Michael Ben-Nes wrote:
 Does anyone have any experience with RAID 10 & PERC 5/i?

Without having done PostgreSQL benchmarking, we have a 2950 with four SATA
disks in RAID 10 (and two SAS disks in RAID 1), and have not seen any
performance issues.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Integrated perc 5/i

2007-08-16 Thread Steinar H. Gunderson
On Thu, Aug 16, 2007 at 10:53:00AM -0700, Luke Lonergan wrote:
 They have a setting that sounds like RAID10, but it actually
 implements spanning of mirrors.

That's interesting. I'm pretty sure it actually says RAID10 in the BIOS,
but is this a lie?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] How to ENABLE SQL capturing???

2007-08-08 Thread Steinar H. Gunderson
On Wed, Aug 08, 2007 at 01:02:24PM -0700, smiley2211 wrote:
 I am trying to enable capturing of the submitted code via an
 application...how do I do this in Postgres?  Performance is SLOW on my
 server and I have autovacuum enabled as well as rebuilt indexes...what else
 should be looked at?

Try log_min_duration_statement = 100 in postgresql.conf; it will show all
statements that take more than 100ms. Set to 0 to log _all_ statements, or
-1 to turn the logging back off.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] FORGOT TO CONFIGURE RAID! DELL POWEREDGE 2950

2007-07-14 Thread Steinar H. Gunderson
On Sat, Jul 14, 2007 at 10:29:05AM +0200, Hannes Dorbath wrote:
 From the DELL site it seems this `PERC 5/i' on board controller
 (assuming that's what you have) doesn't even have a BBU. If you don't
 plan to post here in a few weeks again about data corruption, go out and
 shop a serious controller.

We have a 2950 with a PERC, and it has a BBU.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] TRUNCATE TABLE

2007-07-13 Thread Steinar H. Gunderson
On Fri, Jul 13, 2007 at 06:17:18PM +0200, Adriaan van Os wrote:
 The hardware is an Intel dual-core 17-inch MacBook Pro running Mac 
 OS X 10.4.

To isolate things, have you tried testing a different operating system?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] PostgreSQL publishes first real benchmark

2007-07-09 Thread Steinar H. Gunderson
On Mon, Jul 09, 2007 at 11:57:13AM -0400, Jignesh K. Shah wrote:
 I think this result will be useful for performance discussions of 
 postgresql against other databases.

 http://www.spec.org/jAppServer2004/results/res2007q3/

Am I right if this is for a T2000 (Niagara) database server? It sure is
interesting, but I can't help thinking it's not a very common
configuration...

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] test / live environment, major performance difference

2007-06-12 Thread Steinar H. Gunderson
On Tue, Jun 12, 2007 at 03:32:40PM +0200, Christo Du Preez wrote:
 As an example I took a table with about 650 rows, having a parentid
 field with an index on parentid.

Try a bigger table. Using an index for only 650 rows is almost always
suboptimal, so it's no wonder the planner doesn't use the index.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] test / live environment, major performance difference

2007-06-12 Thread Steinar H. Gunderson
On Tue, Jun 12, 2007 at 04:11:33PM +0200, Christo Du Preez wrote:
 The actual table I noticed the problem has a million rows and it still
 doesn't use indexing

Then please post an EXPLAIN ANALYZE of the query that is slow, along with the
table definition and indexes.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Steinar H. Gunderson
On Wed, Jun 06, 2007 at 07:27:27PM -0400, Kurt Overberg wrote:
 This query runs great on production under 8.0 (27ms), but under 8.2.4  
 (on my mac) I'm seeing times in excess of 50,000ms.  Note that on  
 8.2.4, if I run the query again, it gets successively faster  
 (50,000ms -> 6000ms -> 27ms).  Is this normal?

Your production server probably has all the data in your cache, and your Mac
has not. Furthermore, they seem to be running on different data sets, judging
from your EXPLAIN ANALYZE.

How big did you say these tables were?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Steinar H. Gunderson
On Thu, Jun 07, 2007 at 11:35:27AM +0200, Steinar H. Gunderson wrote:
 How big did you say these tables were?

Sorry, you already said that -- 650k rows for one of them. If that table
doesn't fit in the cache on your Mac, you pretty much lose. From the EXPLAIN
output, it looks like it fits very nicely in cache on your server. Thus, I
don't think the difference is between 8.0 and 8.2, but rather your production
server and your test machine.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Getting Slow

2007-06-07 Thread Steinar H. Gunderson
On Thu, Jun 07, 2007 at 01:48:43PM -0400, Joe Lester wrote:
 - The server log shows frequent archived transaction log file  
 entries. Usually once every 10 minutes or so, but sometimes 2 or 3  
 per minute.

Sounds like you've got a lot of writes going. You might want more power in
your I/O?

 Operating System: Mac OS 10.4.7 Client

Is there a particular reason for this? It's not known to be the best server
OS around -- it's hard to say that an OS change would do anything for your
problem, but it looks like an unusual choice.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Performance Problem

2007-06-05 Thread Steinar H. Gunderson
On Tue, Jun 05, 2007 at 03:23:35PM +0530, Gauri Kanekar wrote:
 Before doing vacuum full on the database this query used to take less
 than 4 min. But now after doing vacuuming & reindexing the tables it is
 taking 73 mins.

Did you analyze the table recently? Some of the selectivity estimates seem
quite a bit off -- you could try raising the statistics target.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-06-05 Thread Steinar H. Gunderson
On Tue, Jun 05, 2007 at 05:30:14PM -0400, Steven Flatt wrote:
 (A) LEFT JOIN (B) ON col WHERE B.col IS NULL
 
 These queries are much slower on 8.2 than on 8.1 for what looks like the
 reason outlined above.  I have rewritten a few key queries to be of the
 equivalent form:
 
 (A) WHERE col NOT IN (SELECT col FROM (B))

At least those _can_ be rewritten into a sane form. I have an application
with a large FULL OUTER JOIN, where _both_ sides can return NULLs. (It's
basically a diff between a current and a wanted state.)

It performs reasonably well under both 8.1 and 8.2, though. Fourteen-table
join or so :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Thousands of tables versus on table?

2007-06-05 Thread Steinar H. Gunderson
On Tue, Jun 05, 2007 at 05:59:25PM -0400, Tom Lane wrote:
 I think the main argument for partitioning is when you are interested in
 being able to drop whole partitions cheaply.

Wasn't there also talk about adding the ability to mark individual partitions
as read-only, thus bypassing MVCC and allowing queries to be satisfied using
indexes only?

Not that I think I've seen it on the TODO... :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Autodetect of software RAID1+0 fails

2007-06-01 Thread Steinar H. Gunderson
On Fri, Jun 01, 2007 at 10:57:56AM -0700, Craig James wrote:
 The Linux kernel doesn't properly detect my software RAID1+0 when I boot 
 up.  It detects the two RAID1 arrays, the partitions of which are marked 
 properly.  But it can't find the RAID0 on top of that, because there's no 
 corresponding device to auto-detect.  The result is that it creates 
 /dev/md0 and /dev/md1 and assembles the RAID1 devices on bootup, but 
 /dev/md2 isn't created, so the RAID0 can't be assembled at boot time.

Either do your md discovery in userspace via mdadm (your distribution can
probably help you with this), or simply use the raid10 module instead of
building raid1+0 yourself.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] setting up raid10 with more than 4 drives

2007-05-31 Thread Steinar H. Gunderson
On Wed, May 30, 2007 at 12:41:46AM -0400, Jonah H. Harris wrote:
 Yeah, I've never seen a way to RAID-1 more than 2 drives either.

pannekake:~> grep -A 1 md0 /proc/mdstat
md0 : active raid1 dm-20[2] dm-19[1] dm-18[0]
  64128 blocks [3/3] [UUU]

It's not a big device, but I can assure you it exists :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] general PG network slowness (possible cure) (repost)

2007-05-25 Thread Steinar H. Gunderson
On Fri, May 25, 2007 at 10:50:58AM +0200, Peter T. Breuer wrote:
 I set up pg to replace a plain gdbm database for my application.

Postgres and gdbm are completely different. You want to rethink your queries
so each does more work, instead of running a zillion of them over the network.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Steinar H. Gunderson
On Fri, May 25, 2007 at 03:56:35PM -0400, Tom Lane wrote:
 I looked through the CVS logs and didn't find any planner changes
 between 8.2.3 and 8.2.4 that seem likely to affect your query, so
 I'm thinking it must be a statistical discrepancy.

It looks like the estimated cost is lower for 8.2.4 -- could it be that
giving it more memory led the planner to pick a plan that happens to be
worse?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Performace comparison of indexes over timestamp fields

2007-05-22 Thread Steinar H. Gunderson
On Tue, May 22, 2007 at 02:39:33PM +0200, Alexander Staubo wrote:
 PostgreSQL uses B-tree indexes for scalar values. For an expression
 such as t between a and b, I believe it's going to match both sides
 of the table independently (i.e., t >= a and t <= b) and intersect
 these subsets. This is inefficient.

A B-tree index can satisfy range queries such as this.
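For instance (table and column names invented), a plain B-tree index handles
this fine:

  CREATE INDEX events_ts_idx ON events (ts);
  SELECT * FROM events WHERE ts BETWEEN '2007-05-01' AND '2007-05-22';

BETWEEN is just ts >= '2007-05-01' AND ts <= '2007-05-22', and both bounds can
be satisfied by a single index range scan; no GiST/R-tree machinery is needed.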

 You should get better performance by mapping timestamps to a
 one-dimensional plane and indexing them using GiST. GiST implements an
 R-tree-like structure that supports bounding-box searches.

You may be thinking of interval overlaps?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 12:02:44PM +0300, Liviu Ionescu wrote:
 the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full 
 of nested loops)

It would probably be useful to see EXPLAIN ANALYZE of your queries, not just
the EXPLAIN.

 Nested Loop Left Join  (cost=32.01..2012.31 rows=1 width=230)

It looks like the planner thinks this is going to be really cheap -- so it's
misestimating something somewhere. Have you ANALYZEd recently?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 01:14:56PM +0300, Liviu Ionescu wrote:
 yes, but to be sure I did it again before issuing the request; no 
 improvements...

Is this with the join collapse limit set to 1, or with default? (Default is
generally more interesting.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
 ->  Hash Left Join  (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)
       Hash Cond: (n.nodeid = templates.nodeid)
       Filter: (templates.nodeid IS NULL)
       ->  Seq Scan on nodes n  (cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 loops=1)
       ->  Hash  (cost=1.64..1.64 rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)
             ->  Seq Scan on templates  (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 rows=64 loops=1)

This seems to be the source of the misestimation. You might want to try using
"n WHERE n.nodeid NOT IN (SELECT nodeid FROM templates)" instead of "n LEFT
JOIN templates USING (nodeid) WHERE templates.nodeid IS NULL" and see if it
helps.
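Spelled out a bit more (a sketch; the select list and any other conditions
from the original query are omitted):

  -- instead of:
  SELECT n.* FROM nodes n
    LEFT JOIN templates USING (nodeid)
   WHERE templates.nodeid IS NULL;

  -- try:
  SELECT n.* FROM nodes n
   WHERE n.nodeid NOT IN (SELECT nodeid FROM templates);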

 Total runtime: 231929.656 ms

Note that this is better than the version with collapse_limit set to 1. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 02:51:42PM +0300, Liviu Ionescu wrote:
 it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.

And the old one?

 any idea why the 8.2.4 planner is not happy with the initial select? was it
 just a big chance that it worked in 8.1.4 or the 8.2.4 planner has a
 problem?

I guess it was more or less by chance, especially as 8.1 did not reorder
outer joins. Others might know more about the estimation, though.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote:
 I have an interesting problem. I have the following query that ran ok on
 Monday and Tuesday and it has been running ok since I have been at this
 job. I have seen it to be IO intensive, but since Wednesday it has become
 CPU intensive. Database wise fresh data has been put into the tables,
 vacuumed  analyzed, no other parameter has been modified.

What Postgres version is this?

   The query is as follows and it's explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the
query indeed does not finish, it's not going to help much.

   set enable_nestloop = off; 

What's the rationale for this?

   HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 02:37:27PM -0700, Abu Mushayeed wrote:
 set enable_nestloop = off; 
 What's the rationale for this?
 To eliminate the nested loop. It does a nested loop between two very large
 tables (millions of rows).

If the planner chooses a nested loop, it is because it believes it is the
most efficient solution. I'd turn it back on and try to figure out why the
planner was wrong. Note that a nested loop with an index scan on one or both
sides can easily be as efficient as anything.

Did you ANALYZE your tables recently? If the joins are really between
millions of rows and the planner thinks it's a couple thousands, the stats
sound rather off... 

 HashAggregate (cost=152555.97..152567.32 rows=267 width=162)
 152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
 probably misestimation involved at some point here. Does it really return 267
 rows, or many more?
 It returns finally about 19-20 thousand rows.

So the planner is off by a factor of at least a hundred. That's a good
first-level explanation for why it's slow, at least...

If you can, please provide EXPLAIN ANALYZE output for your query (after
running ANALYZE on all your tables, if you haven't already); even though
it will take some time, it usually makes this kind of performance debugging
much easier.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Steinar H. Gunderson
On Sat, May 19, 2007 at 12:32:33AM +0200, Steinar H. Gunderson wrote:
 Did you ANALYZE your tables recently? If the joins are really between
 millions of rows and the planner thinks it's a couple thousands, the stats
 sound rather off... 

Sorry, I forgot your first e-mail where you said you had both vacuumed and
analyzed recently. The estimates are still off, though -- the WHERE condition
might be difficult to estimate properly. (I'm not sure how Tom arrived at
his conclusion of expensive user-defined functions, but given the usual
precision of his guesses, I'd check that too...)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread Steinar H. Gunderson
On Wed, May 09, 2007 at 01:57:51AM -0700, [EMAIL PROTECTED] wrote:
 given that RAID, snapshots, etc are already in the linux kernel, I suspect 
 that what will need to happen is for the filesystem to be ported without 
 those features and then the userspace tools (that manipulate the volumes ) 
 be ported to use the things already in the kernel.

Well, part of the idea behind ZFS is that these parts are _not_ separated in
layers -- for instance, the filesystem can push data down to the RAID level
to determine the stripe size used.

Whether this is a good idea is of course hotly debated, but I don't think you
can port just the filesystem part and call it a day.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Best OS for Postgres 8.2

2007-05-08 Thread Steinar H. Gunderson
On Mon, May 07, 2007 at 11:56:14PM -0400, Greg Smith wrote:
 Debian packages PostgreSQL in a fashion unique to it; it's arguable 
 whether it's better or not (I don't like it), but going with that will 
 assure your installation is a bit non-standard compared with most Linux 
 installs.  The main reasons you'd pick Debian are either that you like 
 that scheme (which tries to provide some structure to running multiple 
 clusters on one box), or that you plan to rely heavily on community 
 packages that don't come with the Redhat distributions and therefore would 
 appreciate how easy it is to use apt-get against the large Debian software 
 repository.

Just to add to this: As far as I understand it, this scheme was originally
mainly put in place to allow multiple _versions_ of Postgres to be installed
alongside each other, for smoother upgrades. (There's a command that does all
the details of running first pg_dumpall for the users and groups, then the
new pg_dump with -Fc to get all data and LOBs over, then some hand-fixing to
change explicit paths to $libdir, etc...)

Of course, you lose all that if you need a newer Postgres version than the OS
provides. (Martin Pitt, the Debian/Ubuntu maintainer of Postgres -- the
packaging in Debian and Ubuntu is the same, sans version differences -- makes
his own backported packages of the newest Postgres to Debian stable; it's up
to you if you'd trust that or not.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Best OS for Postgres 8.2

2007-05-08 Thread Steinar H. Gunderson
On Mon, May 07, 2007 at 03:14:08PM -0700, Joshua D. Drake wrote:
 It is my understanding (and I certainly could be wrong) that FreeBSD
 doesn't handle SMP nearly as well as Linux (and Linux not as well as
 Solaris).

I'm not actually sure about the last part. There are installations as big as
1024 CPUs that run Linux -- most people won't need that, but it's probably an
indicator that eight cores should run OK :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] specific query (not all) on Pg8 MUCH slower than Pg7

2007-05-08 Thread Steinar H. Gunderson
On Tue, May 08, 2007 at 10:18:34AM -0400, Susan Russo wrote:
 explain analyze output on Pg7.3.2:
 
    ->  Index Scan using dbxref_idx2 on dbxref dx  (cost=0.00..5.83 rows=1 width=21) (actual time=25.58..25.58 rows=0 loops=1)
          Index Cond: ((accession >= 'AY851043'::character varying) AND (accession < 'AY851044'::character varying))
          Filter: (accession ~~ 'AY851043%'::text)
 
 explain analyze output on Pg8.1.4:
 
    ->  Seq Scan on dbxref dx  (cost=0.00..47923.91 rows=1 width=21) (actual time=2463.646..2463.646 rows=0 loops=1)
          Filter: ((accession)::text ~~ 'AY851043%'::text)

This is almost all of your cost. Did you perchance initdb the 8.1.4 cluster
in a non-C locale? You could always try

  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);

which would create an index that might be more useful for your LIKE query,
even in a non-C locale.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] DISTINCT Question

2007-05-08 Thread Steinar H. Gunderson
On Tue, May 08, 2007 at 12:52:35PM -0700, Y Sidhu wrote:
 Does using DISTINCT in a query force PG to abandon any index search it might
 have embarked upon?

No.

If you need help with a specific query, please post it, along with your table
definitions and EXPLAIN ANALYZE output.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Throttling PostgreSQL's CPU usage

2007-05-08 Thread Steinar H. Gunderson
On Tue, May 08, 2007 at 04:27:10PM -0400, Daniel Griscom wrote:
 3: ... some other solution I haven't thought of.

On a wild guess, could you try setting the CPU costs higher, to make the
planner choose a less CPU-intensive plan?

Other (weird) suggestions would include calling a user-defined function that
sleep()ed for you between every row. Or use a dual-core system. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] What's The Difference Between VACUUM and VACUUM ANALYZE?

2007-05-08 Thread Steinar H. Gunderson
On Tue, May 08, 2007 at 05:52:13PM -0400, Alvaro Herrera wrote:
 I am trying to follow a message thread. One guy says we should be running
 vacuum analyze daily and the other says we should be running vacuum multiple
 times a day. I have tried looking for what a vacuum analyze is to help me
 understand but no luck.
 VACUUM ANALYZE is like VACUUM, except that it also runs an ANALYZE
 afterwards.

Shoot me if I'm wrong here, but doesn't VACUUM ANALYZE check _all_ tuples,
as compared to the random selection employed by ANALYZE?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Throttling PostgreSQL's CPU usage

2007-05-08 Thread Steinar H. Gunderson
On Tue, May 08, 2007 at 06:32:14PM -0400, Carlos Moreno wrote:
 Or use a dual-core system. :-)
 Am I missing something??  There is just *one* instance of this idea in,
 what, four replies??  I find it so obvious, and so obviously the only
 solution that has any hope to work, that it makes me think I'm missing
 something ...

Actually, it should be added that this suggestion was only partially
tongue-in-cheek. I wrote a 3D application as part of an internship a couple
of years ago, and it had a problem that worked vaguely like the given
scenario: Adding a background task (in this case the task that loaded in new
pieces of terrain) would kill the framerate for the user, but nicing down
(actually, down-prioritizing, as this was on Windows) the back-end would
starve it completely of cycles. The solution was to just define that this
would only be run on multiprocessor systems, where both tasks would chug
along nicely :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question

2007-05-05 Thread Steinar H. Gunderson
On Sat, May 05, 2007 at 03:57:25PM -0700, Yudhvir Singh Sidhu wrote:
 Situation:  huge amounts of adds and deletes daily. Running daily vacuums

If you have huge amounts of adds and deletes, you might want to vacuum more
often; optionally, look into autovacuum.

 Problem:  Vacuum times jump up from 45 minutes, or 1:30 minutes to 6+ 
 hours overnight, once every 1 to 3 months.

You might want to check your FSM settings. Take a look at the output of
VACUUM VERBOSE and see how the results stack up against your FSM settings.
Optionally, you could do a VACUUM FULL to clear the bloat, but this will lock
the tables and is not recommended on a regular basis.
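A quick way to check (a sketch; the numbers are whatever your installation
reports):

  VACUUM VERBOSE;          -- the summary at the end shows how many page slots the FSM would need
  SHOW max_fsm_pages;      -- compare against that summary
  SHOW max_fsm_relations;

If the FSM is too small, raise max_fsm_pages/max_fsm_relations in
postgresql.conf (this needs a restart) so that dead space can actually be
reused.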

 I know my indexes are getting fragmented and my tables are getting 
 fragmented. 

This sounds like a case of table bloat, ie. vacuuming too seldom and/or too
low FSM settings.

 I also know that some of my btree indexes are not being used in queries.

This is a separate problem, usually; if you need help with a specific query,
post query and the EXPLAIN ANALYZE output here. (Note that using an index is
not always a win; Postgres' planner knows about this and tries to figure out
when it is a win and when it is not.)

 I also know that using UNIQUE in a query makes PG ignore any index.

Do you mean DISTINCT? There are known problems with SELECT DISTINCT, but I'm
not sure how it could make Postgres start ignoring an index. Again, it's a
separate problem.

 I am looking for the cause of this. Recently I have been looking at 
 EXPLAIN and ANALYZE.

This is a good beginning. :-)

 1.  Running EXPLAIN on a query tells me how my query SHOULD run and 
 running ANALYZE tells me how it DOES run. Is that correct?

Nearly. EXPLAIN shows you the plan Postgres has chosen, with estimates of the
cost of each step. EXPLAIN ANALYZE (plain ANALYZE is a different command,
which updates the planner's statistics) does the same, but also runs the query
and shows the time each step actually took. (Note that the units of the
estimates and the timings are different, so you can't compare them directly.)
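A minimal illustration of the difference (the table is made up):

  EXPLAIN SELECT * FROM foo WHERE id = 42;          -- plan and cost estimates only; the query is not run
  EXPLAIN ANALYZE SELECT * FROM foo WHERE id = 42;  -- runs the query, adds actual times and row counts
  ANALYZE foo;                                      -- different thing entirely: refreshes planner statistics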

 2.  If (1) is true, then a difference between the two means my query 
 plan is messed up and running ANALYZE on a table-level will somehow 
 rebuild the plan. Is that correct?

Again, sort of right, but not entirely. ANALYZE updates the planner's
statistics. Having good statistics is very useful for the planner in
selecting the plan that actually ends up being the best.

 3.  If (2) is correct, then  running ANALYZE on a nightly basis before 
 running vacuum will keep vacuum times down. Is that correct?

No, ANALYZE will only update planner statistics, which has nothing to do with
vacuum times. On the other hand, it might help with some of your queries.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Intermitent slow queries

2007-05-02 Thread Steinar H. Gunderson
On Wed, May 02, 2007 at 02:55:26PM -0400, Ron wrote:
 The fastest test, and possible fix, is to go and buy more RAM.  See 
 if 16MB of RAM, heck even 10MB, makes the problem go away or delays 
 its onset.

Something tells me 16MB of RAM is not going to help him much? :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Query performance problems with partitioned tables

2007-04-30 Thread Steinar H. Gunderson
On Mon, Apr 30, 2007 at 03:29:30PM +0200, Andreas Haumer wrote:
 This already gives a row matching the given WHERE clause.
 It makes no sense to scan the other tables, as the query
 asks for one row only and all the other tables have timestamps
 larger than all the timestamps in table t_mv_200601 (according
 to the CHECK constraints for the partion tables)

So for each row, it has to check all CHECK constraints to see if it has
enough rows? That sounds fairly inefficient.

I wonder if the planner could copy the limit down through the Append, though
-- it certainly doesn't need more than one row from each partition. It sounds
slightly cumbersome to try to plan such a thing, though...

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] index usage

2007-04-23 Thread Steinar H. Gunderson
On Mon, Apr 23, 2007 at 07:20:29PM +0200, Arkadiusz Raj wrote:
 I have a table in my database that is updated every minute with newly
 acquired data. Anyway there is a query to get the latest values to be
 displayed on screen. I have postgresql 7.4.2 that works very fine.

You want _at least_ the latest 7.4 version -- ideally, the latest 8.2
version.

 The problem was that after an hdd crash I have rebuilt the database from
 the archive and... Execution time of this query starts to be unacceptable.

Have you re-ANALYZEd after the data load?

Anyhow, the issue with the planner not knowing how to estimate expressions
like now() - interval '5 minutes' correctly is a known 7.4 issue, and it's
fixed in later versions. It might have worked more or less by accident
earlier, although it seems odd that it wouldn't even have considered the
index scan...

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Opinions on Raid

2007-03-05 Thread Steinar H. Gunderson
On Sat, Mar 03, 2007 at 12:30:16PM +0100, Arjen van der Meijden wrote:
 If you have a MegaCLI-version, I'd like to see it, if possible? That 
 would definitely save us some reinventing the wheel  :-)

A friend of mine just wrote

  MegaCli -AdpAllInfo -a0|egrep '  (Degraded|Offline|Critical Disks|Failed Disks)' | grep -v ': 0 $'

which will output errors if there are any, and none otherwise. Or just add -q
to the grep and check the return status.

(Yes, simplistic, but often all you want to know is if all's OK or not...)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] pg_trgm performance

2007-02-24 Thread Steinar H. Gunderson
On Sat, Feb 24, 2007 at 02:04:36AM +0100, Guillaume Smet wrote:
 Could you post EXPLAIN ANALYZE for both queries (after 2 or 3 runs)?

GIST version, short:

amarok=# explain analyze select count(*) from tags where title % 'foo';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=147.84..147.85 rows=1 width=0) (actual time=16.873..16.875 rows=1 loops=1)
   ->  Bitmap Heap Scan on tags  (cost=4.59..147.74 rows=41 width=0) (actual time=16.828..16.850 rows=7 loops=1)
         Recheck Cond: (title % 'foo'::text)
         ->  Bitmap Index Scan on trgm_idx  (cost=0.00..4.58 rows=41 width=0) (actual time=16.818..16.818 rows=7 loops=1)
               Index Cond: (title % 'foo'::text)
 Total runtime: 16.935 ms
(6 rows)

GiN version, short:

amarok=# explain analyze select count(*) from tags where title % 'foo';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=151.89..151.90 rows=1 width=0) (actual time=30.197..30.199 rows=1 loops=1)
   ->  Bitmap Heap Scan on tags  (cost=8.64..151.79 rows=41 width=0) (actual time=5.555..30.157 rows=7 loops=1)
         Filter: (title % 'foo'::text)
         ->  Bitmap Index Scan on trgm_idx  (cost=0.00..8.63 rows=41 width=0) (actual time=2.857..2.857 rows= loops=1)
               Index Cond: (title % 'foo'::text)
 Total runtime: 30.292 ms
(6 rows)

GIST version, medium:

amarok=# explain analyze select count(*) from tags where title % 'chestnuts roasting on an 0pen fire';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=147.84..147.85 rows=1 width=0) (actual time=216.149..216.151 rows=1 loops=1)
   ->  Bitmap Heap Scan on tags  (cost=4.59..147.74 rows=41 width=0) (actual time=216.135..216.137 rows=1 loops=1)
         Recheck Cond: (title % 'chestnuts roasting on an 0pen fire'::text)
         ->  Bitmap Index Scan on trgm_idx  (cost=0.00..4.58 rows=41 width=0) (actual time=216.124..216.124 rows=1 loops=1)
               Index Cond: (title % 'chestnuts roasting on an 0pen fire'::text)
 Total runtime: 216.214 ms
(6 rows)

GiN version, medium:

amarok=# explain analyze select count(*) from tags where title % 'chestnuts roasting on an 0pen fire';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=151.89..151.90 rows=1 width=0) (actual time=156.310..156.312 rows=1 loops=1)
   ->  Bitmap Heap Scan on tags  (cost=8.64..151.79 rows=41 width=0) (actual time=156.205..156.299 rows=1 loops=1)
         Filter: (title % 'chestnuts roasting on an 0pen fire'::text)
         ->  Bitmap Index Scan on trgm_idx  (cost=0.00..8.63 rows=41 width=0) (actual time=155.748..155.748 rows=36 loops=1)
               Index Cond: (title % 'chestnuts roasting on an 0pen fire'::text)
 Total runtime: 156.376 ms
(6 rows)

GIST version, long:

amarok=# explain analyze select count(*) from tags where title % 'Donaueschingen (Peter Kruders Donaudampfschifffahrtsgesellschaftskapitänskajütenremix)';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=147.84..147.85 rows=1 width=0) (actual time=597.115..597.117 rows=1 loops=1)
   ->  Bitmap Heap Scan on tags  (cost=4.59..147.74 rows=41 width=0) (actual time=597.102..597.104 rows=1 loops=1)
         Recheck Cond: (title % 'Donaueschingen (Peter Kruders Donaudampfschifffahrtsgesellschaftskapitänskajütenremix)'::text)
         ->  Bitmap Index Scan on trgm_idx  (cost=0.00..4.58 rows=41 width=0) (actual time=597.093..597.093 rows=1 loops=1)
               Index Cond: (title % 'Donaueschingen (Peter Kruders Donaudampfschifffahrtsgesellschaftskapitänskajütenremix)'::text)
 Total runtime: 597.173 ms
(6 rows)

GiN version, long:

amarok=# explain analyze select count(*) from tags where title % 'Donaueschingen (Peter Kruders Donaudampfschifffahrtsgesellschaftskapitänskajütenremix)';
                                                          QUERY PLAN

Re: [PERFORM] which Xeon processors don't have the context switching problem

2007-02-23 Thread Steinar H. Gunderson
On Fri, Feb 23, 2007 at 02:05:57PM -0500, Geoffrey wrote:
 In searching the archives, I can't find any specific info indentifying 
 which Xeon processors don't have this problem.

AFAIK the cut-off point is at the Woodcrests. They are overall much better
suited to PostgreSQL than the older Xeons were.

It's slightly unfortunate that AMD and Intel cling to the Opteron and Xeon
names even though they're making significant architecture changes, but that's
life, I guess.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] which Xeon processors don't have the context switching problem

2007-02-23 Thread Steinar H. Gunderson
On Fri, Feb 23, 2007 at 04:53:18PM -0300, Alvaro Herrera wrote:
 It's slightly unfortunate that AMD and Intel cling to the Opteron and Xeon
 names even though they're making significant architecture changes, but that's
 life, I guess.
 AFAIR Intel has been calling their server processors Xeon since Pentium
 Pro's, at least.

Yes, that was sort of my point. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] pg_trgm performance

2007-02-23 Thread Steinar H. Gunderson
On Sat, Feb 24, 2007 at 12:09:41AM +0100, Guillaume Smet wrote:
 Could you try to see if the GIN implementation of pg_trgm is faster in
 your cases?

I'm sorry, I can no longer remember where I needed pg_trgm. Simple testing of
your patch seems to indicate that the GiN version is about 65% _slower_ (18ms
vs. 30ms) for a test data set I found lying around, but I remember that on
the data set I needed it, the GIST version was a lot slower than that (think
3-400ms). The 18 vs. 30ms test is a random Amarok database, on 8.2.3
(Debian).

Sorry I couldn't be of more help.
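
For anyone wanting to repeat the comparison, a minimal sketch of the setup: it assumes a
tags(title text) table like the amarok=# examples above, and a pg_trgm build that provides
both the GiST opclass and the GIN one the patch under discussion adds. Index names are made up.

  CREATE INDEX tags_title_trgm_gist ON tags USING gist (title gist_trgm_ops);
  CREATE INDEX tags_title_trgm_gin  ON tags USING gin  (title gin_trgm_ops);

  EXPLAIN ANALYZE
  SELECT count(*) FROM tags WHERE title % 'chestnuts roasting on an 0pen fire';

Dropping one of the two indexes before each run makes sure the planner actually exercises
the implementation you want to time.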

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] slow update on 1M rows (worse with indexes)

2007-02-22 Thread Steinar H. Gunderson
On Thu, Feb 22, 2007 at 07:11:42PM +0100, Gabriel Biberian wrote:
 Create a new DB and load a dump of the above database with 976009 rows, 
 then i perform updates on the whole table.  I recorded the time taken 
 for each full update and the amount of extra disk space used.  Each 
 consecutive update of the table is slower than the previous
 [EMAIL PROTECTED]:~$ time psql TOTO -c "UPDATE versions_9d SET flag=2"
 UPDATE 976009
 real    0m41.542s

You're creating a huge amount of dead rows by this kind of procedure. Try a
VACUUM in-between, or enable autovacuum. (Adjusting your WAL and
checkpointing settings might help too.)
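
For instance, something as simple as this between the bulk updates (the table name is taken
from your example, the rest is only illustrative):

  UPDATE versions_9d SET flag = 2;
  VACUUM versions_9d;   -- reclaims the dead row versions left behind by the previous UPDATE
  UPDATE versions_9d SET flag = 3;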

Apart from that, do you really have a scenario that requires updating _all_
rows in your table regularly?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Vacuum full very slow due to nonremovable dead rows...What makes the dead rows non-removable?

2007-02-22 Thread Steinar H. Gunderson
On Thu, Feb 22, 2007 at 12:19:50PM -0800, Barbara Cosentino wrote:
 I have a table that gets many rows inserted, updated and then deleted,
 consistently throughout the day.  At any point in time this table should
 have no more than 50 actual rows and many times a direct select against
 this table produces no rows.  This table also has a VACUUM FULL ANALYZE
 performed against it about every 30 minutes.

You should not usually need VACUUM FULL; doing so all the time will probably
_decrease_ your performance.

 I noticed the vacuum was taking a considerable amount of time for a table
 with a small number of actual rows.  The output of the first vacuum full
 analyze verbose I performed showed that this table had 3,699,704 dead row
 versions that could not be removed.  This number of dead rows that could
 not be released increased with each vacuum full that was performed. The
 output of the last vacuum full is shown below.  

Do you have any long-running transactions going? Those are likely to make
rows nonremovable. Look for idle workers in a transaction.
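
A sketch of how to spot them, assuming the pre-9.2 pg_stat_activity column names (and
stats_command_string turned on, or you won't see the query text):

  SELECT procpid, usename, query_start, current_query
    FROM pg_stat_activity
   WHERE current_query = '<IDLE> in transaction'
   ORDER BY query_start;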

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Not Picking Index

2007-02-16 Thread Steinar H. Gunderson
On Fri, Feb 16, 2007 at 01:27:46PM -0500, Brad Nicholson wrote:
 If the indexed field is an integer, add 0 to it.

Won't that also invalidate the statistics?
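
For reference, the trick being discussed looks like this (table and column are hypothetical):

  -- The expression id + 0 no longer matches the plain index on id, so the planner
  -- won't consider that index -- but it also has no column statistics for "id + 0",
  -- which is what the question above is getting at.
  SELECT * FROM orders WHERE id + 0 = 12345;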

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] trouble with a join on OS X

2007-02-02 Thread Steinar H. Gunderson
On Fri, Feb 02, 2007 at 07:52:48AM -0600, Kirk Wythers wrote:
 psql(15811) malloc: *** vm_allocate(size=8421376) failed (error code=3)
 psql(15811) malloc: *** error: can't allocate region
 psql(15811) malloc: *** set a breakpoint in szone_error to debug

It sounds like you are out of memory. Have you tried reducing work_mem?
Actually, what does your postgresql.conf look like with regard to memory
settings?
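
The quickest way to experiment is per session, without touching postgresql.conf (on 8.1 the
value is an integer in kB; the figure here is only an example):

  SHOW work_mem;
  SET work_mem = 32768;   -- 32 MB, per sort/hash and per backend
  -- then re-run the failing query and see whether the allocation error goes away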

 This is a dual G5 box with 6 gigs of ram running postgresql 8.1. I  
 have not tired altering kernel resources (as described in http:// 
 www.postgresql.org/docs/8.1/interactive/kernel-resources.html#SHARED- 
 MEMORY-PARAMETERS), or compiling for 64 bit. I'm just not sure what  
 to try next. Does anyone have any suggestions?

Compiling for 64 bit might very well help you, but it sounds odd to use
several gigabytes of RAM for a sort.

Could you post EXPLAIN ANALYZE for the query with only one row, as well
as your table schema?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] trouble with a join on OS X

2007-02-02 Thread Steinar H. Gunderson
On Fri, Feb 02, 2007 at 10:05:29AM -0600, Kirk Wythers wrote:
 Thanks Tom... Any suggestions as to how much to raise ulimit -d? And  
 how to raise ulimit -d?

Try multiplying it by 100 for a start:

  ulimit -d 614400

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] slow result

2007-01-23 Thread Steinar H. Gunderson
On Tue, Jan 23, 2007 at 11:34:52AM +0100, Laurent Manchon wrote:
 I have a slow response of my PostgreSQL database 7.4 using this query below
 on a table with 80 rows:
 
 select count(*)from tbl;

Contrary to your expectations, this is _not_ a query you'd expect to be fast
in Postgres. Try real queries from your application instead -- most likely,
you'll find them to be much master. (If not, come back with the query, the
schema and the EXPLAIN ANALYZE output of your query, and you'll usually get
help nailing down the issues. :-) )

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] slow result

2007-01-23 Thread Steinar H. Gunderson
On Tue, Jan 23, 2007 at 11:55:41AM +0100, Steinar H. Gunderson wrote:
 you'll find them to be much master.

s/master/faster/

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] pg_trgm performance

2007-01-15 Thread Steinar H. Gunderson
On Mon, Jan 15, 2007 at 11:16:36AM +0100, Florian Weimer wrote:
 Am I missing something?  Or are trigrams just a poor match for my data
 set?  Are the individual strings too long, maybe?

FWIW, I've seen the same results with 8.1.x.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Large table performance

2007-01-12 Thread Steinar H. Gunderson
On Fri, Jan 12, 2007 at 07:40:25PM -0500, Dave Cramer wrote:
 5000 is pretty low, you need at least 1/4 of memory for an 8.1.x or  
 newer server.

Is this the new common wisdom? It looks like at some point, someone here
said "oh, and it looks like you're better off using large values here for
8.1.x and newer", and now everybody seems to repeat it as if it was always
well-known.

Are there any real benchmarks out there that we can point to? And, if you set
shared_buffers to half of the available memory, won't the kernel cache
duplicate more or less exactly the same data? (At least that's what people
used to say around here, but I guess the kernel cache gets adapted to the
fact that Postgres won't ask for the most common stuff, ie. the one in the
shared buffer cache.)
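
For concreteness, the kind of numbers being debated would look roughly like this in
postgresql.conf on a 4 GB machine (purely illustrative; on 8.1 both values are counted
in 8 kB pages):

  shared_buffers = 131072          # ~1 GB, i.e. the "1/4 of memory" advice
  effective_cache_size = 393216    # ~3 GB: what the kernel is expected to cache anyway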

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] URGENT: Out of disk space pg_xlog

2006-12-22 Thread Steinar H. Gunderson
On Fri, Dec 22, 2006 at 07:47:05PM +0100, ohp@pyrenet.fr wrote:
 No, pg_xlog can be truncated as soon as a checkpoint occurs.
 Even for currently running transactions ?

Isn't that the entire point of having checkpoints in the first place? :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] GROUP BY vs DISTINCT

2006-12-20 Thread Steinar H. Gunderson
On Tue, Dec 19, 2006 at 11:19:39PM -0800, Brian Herlihy wrote:
 Actually, I think I answered my own question already.  But I want to
 confirm - Is the GROUP BY faster because it doesn't have to sort results,
 whereas DISTINCT must produce sorted results?  This wasn't clear to me from
 the documentation.  If it's true, then I could save considerable time by
 using GROUP BY where I have been using DISTINCT in the past.  Usually I
 simply want a count of the distinct values, and there is no need to sort
 for that.

You are right; at the moment, GROUP BY is more intelligent than DISTINCT,
even if they have to compare the same columns. This is, as always, something
that could be improved in a future release, TTBOMK.
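
For example (table and column made up), these two return the same number, but only the
GROUP BY form is currently free to use a HashAggregate instead of sorting:

  SELECT count(*) FROM (SELECT DISTINCT customer_id FROM orders) AS d;
  SELECT count(*) FROM (SELECT customer_id FROM orders GROUP BY customer_id) AS g;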

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-16 Thread Steinar H. Gunderson
On Sat, Dec 16, 2006 at 10:53:21AM -0500, Ron wrote:
 AFAICT, no one has stated there would be a blow-your-socks-off 
 dramatic performance improvement for pg due to compilation 
 options.  Just that there might be some, and there might be some that 
 are arch specific.

FWIW, the original claim was: "It's really important to have your GLIBC
compiled for your processor.  It is essencial for performance."

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Scaling concerns

2006-12-16 Thread Steinar H. Gunderson
On Sat, Dec 16, 2006 at 11:26:02AM -0600, tsuraan wrote:
 Even an operation like select count(*) from messages can take minutes,
 with a totally idle system.  Postgres seems to be the most scalable Free
 database out there, so I must be doing something wrong.

Unqualified SELECT COUNT(*) FROM foo is one of the most expensive operations
you can do on your system, since the visibility information has to be checked
on disk for each row. Instead, try real queries on real data, and post here
if some are too slow for you.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Steinar H. Gunderson
On Fri, Dec 15, 2006 at 10:53:25AM +0100, Alexander Staubo wrote:
 The difference is very slight. I'm going to run without -funroll-loops
 and -pipe (which are not arch-related) to get better data.

-pipe does not matter for the generated code; it only affects compiler speed.
(It simply means that the compiler runs cpp | cc | as1 instead of cpp > tmp;
cc < tmp > tmp2; as1 < tmp2.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-12 Thread Steinar H. Gunderson
On Tue, Dec 12, 2006 at 01:42:06PM +0100, Cosimo Streppone wrote:
 -O0 ~ 957 tps
 -O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
 -O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
 -O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
 -O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps

For the record, -O3 = -O6 for regular gcc. It used to matter for pgcc, but
that is hardly in use anymore.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-11 Thread Steinar H. Gunderson
On Mon, Dec 11, 2006 at 09:05:56AM -0200, Daniel van Ham Colchete wrote:
 But, trust me on this one. It's worth it.

You know what? I don't.

 Think of this: PostgreSQL and GNU LibC use a lot of complex algorithms:
 btree, hashes, checksums, strings functions, etc... And you have a lot of
 ways to compile it into binary code. Now you have Pentium4's vectorization
 that allow you to run plenty of instructions in paralell, but AMD doesn't
 have this. Intel also have SSE2 that makes double-precision floatpoint
 operations a lot faster, AMD also doesn't have this (at least on 32bits). 

Athlon 64 has SSE2, also in 32-bit-mode.

Of course, it doesn't really matter, since at the instant you hit the disk
even once, it's going to take a million cycles and any advantage you got from
saving single cycles is irrelevant.

 Imagine that you are GCC and that you have two options in front of
 you: you can use FSQRT or FDIV plus 20 ADD/SUB.

Could you please describe a reasonable case where GCC would have such an
option? I cannot imagine any.

 An example that I know of: it's impossible to run my software at a
 high demanding customer without compiling it to the it's processor (I
 make 5 compilations on every release).

What's your software? How can you make such assertions without backing them
up? How can you know that the same holds for PostgreSQL?

As Mike said, point to the benchmarks showing this essential difference
between -O2 and -O2 -mcpu=pentium4 (or whatever). The only single worthwhile
difference I can think of, is that glibc can use the SYSENTER function if it
knows you have a 686 or higher (which includes AMD), and with recent kernels,
I'm not even sure if that is needed anymore.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-11 Thread Steinar H. Gunderson
On Mon, Dec 11, 2006 at 11:09:13AM -0200, Daniel van Ham Colchete wrote:
 You know what? I don't.
 So test it yourself.

You're making the claims, you're supposed to be proving them...

 As I said, it is an example. Take floatpoint divisions. You have
 plenty of ways of doing it: 387, MMX, SSE, 3dNow, etc... Here GCC have
 to make a choice. 

No, you don't. MMX, SSE and 3Dnow! will all give you the wrong result
(reduced precision). SSE2, on the other hand, has double precision floats, so
you might have a choice there -- except that PostgreSQL doesn't really do a
lot of floating-point anyhow.

 And this is only one case. Usually, compiler optimizations are really
 complex and the processor's timings counts a lot.

You keep asserting this, with no good backing.

 If you still can't imagine any case, you can read Intel's assembler
 reference. You'll see that there are a lot of ways of doing a lot of
 things.

I've been programming x86 assembler for ten years or so...

 Steinar, you should really test it. I won't read the PostgreSQL source
 to point you were it could use SSE or SSE2 or whatever. And I won't
 read glibc's code.

Then you should stop making these sort of wild claims.

 You don't need to belive in what I'm saying. You can read GCC docs,
 Intel's assembler reference, AMD's docs about their processor and
 about how diferent that arch is.

I have.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-11 Thread Steinar H. Gunderson
On Mon, Dec 11, 2006 at 11:17:06AM -0200, Daniel van Ham Colchete wrote:
 I just remebered one case with MySQL. When I changed the distro from
 Conectiva 10 (rpm-based ended brazilian distro) to Gentoo, a MySQL
 operation that usually took 2 minutes to run, ended in 47 seconds.

How do you know that this improvement had _anything_ to do with the use of
different optimization flags? Were even the MySQL versions or configuration
the same?

 This is absolutely vage.

Indeed it is.

 I don't have how to prove it to you.

No, but you should stop making this sort of absolutely essential claims if
you can't.

 And I can't mesure how each factor helped: compiling glibc and Mysql with
 good cflags, rebuilding my database in a ordered way, never kernel, etc..

Exactly. So why are you attributing it to the first factor only? And why do
you think this would carry over to PostgreSQL?

Remember, anecdotal evidence isn't.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-11 Thread Steinar H. Gunderson
On Mon, Dec 11, 2006 at 11:31:48AM -0200, Daniel van Ham Colchete wrote:
 What PostgreSQL benchmark software should I use???

Look up the list archives; search for TPC.

 I'll test PostgreSQL 8.1 on a Fedora Core 6 and on a Gentoo. I'll get
 the same version FC6 uses and install it at my Gentoo. I'll use the
 same hardware (diferent partitions to each).

Why do you want to compare FC6 and Gentoo? Wasn't your point that the -march=
was supposed to be the relevant factor here? In that case, you want to keep
all other things equal; so use the same distribution, only with -O2
-march=i686 vs. -march=athlon-xp (or whatever).
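
Concretely, something along these lines, keeping everything else identical (paths and flags
are only an example):

  CFLAGS="-O2 -march=i686" ./configure --prefix=/usr/local/pgsql-i686 && make && make install
  make distclean
  CFLAGS="-O2 -march=athlon-xp" ./configure --prefix=/usr/local/pgsql-athlon && make && make install
  # then run the same benchmark against both builds, on the same data, kernel and configuration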

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Locking in PostgreSQL?

2006-12-06 Thread Steinar H. Gunderson
On Wed, Dec 06, 2006 at 08:26:45AM -0500, Dave Cramer wrote:
 Apparently I've completely misunderstood MVCC then My  
 understanding is that unless you do a select ... for update then  
 update the rows will not be locked .

The discussion was about updates, not selects. Selects do not in general lock
(except for ... for update, as you say).

To (partially) answer the original question: The number of columns updated
does not matter for the locking situation.
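
A small illustration (table name made up):

  BEGIN;
  SELECT balance FROM accounts WHERE id = 1;                -- takes no row lock
  SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;     -- locks that row until COMMIT/ROLLBACK
  UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- locks the row it touches,
                                                            -- regardless of how many columns change
  COMMIT;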

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] File Systems Compared

2006-12-06 Thread Steinar H. Gunderson
On Wed, Dec 06, 2006 at 05:31:01PM +0100, Markus Schiltknecht wrote:
 Care to post these numbers *without* word wrapping? Thanks.
 How is one supposed to do that? Care giving an example?

This is a rather long sentence without any kind of word wrapping except what 
would be imposed on your own side -- how to set that up properly depends on the 
sending e-mail client, but in mine it's just a matter of turning off the word 
wrapping in your editor :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



[PERFORM] [offtopic] Word wrapping

2006-12-06 Thread Steinar H. Gunderson
On Wed, Dec 06, 2006 at 06:45:56PM +0100, Markus Schiltknecht wrote:
 Cool, thank you for the example :-)  I thought the MTA or at least the the 
 mailing list would wrap mails at some limit. I've now set word-wrap to  
 characters (it seems not possible to turn it off completely in 
 thunderbird). But when writing, I'm now getting one long line.

Thunderbird uses format=flowed, so it's wrapped nevertheless. Google to find
out how to turn it off if you really need to.

 What's common practice?

Usually 72 or 76 characters, TTBOMK -- but when posting tables or big query
plans, one should simply turn it off, as it kills readability.

 What's it on the pgsql mailing lists?

No idea. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Performance of ORDER BY

2006-12-05 Thread Steinar H. Gunderson
On Tue, Dec 05, 2006 at 01:02:06PM -0500, Tom Lane wrote:
 In 8.0 that might be counterproductively high --- we have seen cases
 where more sort_mem = slower with the older sorting code.  I concur
 with Luke's advice that you should update to 8.2 (not 8.1) to get the
 improved sorting code.

By the way, is the new sorting code any better for platforms that already
have a decent qsort() (like Linux)?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Steinar H. Gunderson
On Sun, Nov 26, 2006 at 09:43:11AM -0300, Marcelo Costa wrote:
 All its tables constantly manipulated (INSERT, UPDATE, DELETE) they need a
 VACUUM

Just a minor clarification here: INSERT does not create dead rows, only
UPDATE and DELETE do. Thus, if you only insert rows, you do not need to
vacuum (although you probably need to analyze).
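
So for an insert-only table (name made up), periodic statistics updates are all that is
really needed:

  ANALYZE audit_log;   -- keeps the planner's statistics current; there are no dead rows to reclaim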

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] PostgreSQL underestimates sorting

2006-11-22 Thread Steinar H. Gunderson
On Wed, Nov 22, 2006 at 11:17:23AM +0100, Markus Schaber wrote:
 The Backend allocates gigs of memory (we've set sort_mem to 1 gig), and
 then starts spilling out more Gigs of temporary data to the disk.

How much RAM is in the server? Remember that sort_mem is _per sort_, so if
you have multiple sorts, it might allocate several multiples of the amount
you set up.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Hundreds of database and FSM

2006-11-15 Thread Steinar H. Gunderson
On Wed, Nov 15, 2006 at 02:31:45PM -0300, Alvaro Herrera wrote:
 This is based on our current 150 databases times 20 tables, or 3000 tables 
 total.  But I wasn't sure if sequences count as relations, which would 
 double the number.
 They don't because they don't have free space.

OTOH, indexes do.
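
Back-of-the-envelope, that gives something like this for postgresql.conf (the index count is
an assumption, and max_fsm_pages really has to be sized from VACUUM VERBOSE output):

  # 150 databases x 20 tables = 3000 tables; with a couple of indexes each,
  # roughly 10000 relations carry free space:
  max_fsm_relations = 10000
  max_fsm_pages = 1000000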

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Strange plan in pg 8.1.0

2006-10-30 Thread Steinar H. Gunderson
On Mon, Oct 30, 2006 at 01:05:07PM +0200, Mattias Kregert wrote:
  ->  Hash Left Join  (cost=59.66..206763.11 rows=1215336 width=27) 
 (actual time=4.959..3228.550 rows=1216434 loops=1)
        Hash Cond: (("outer".prislista = ("inner".listid)::text) AND 
 ("outer".tjanst = ("inner".tjanstid)::text))

Note the conversion to text here. Are you sure the types are matching on both
sides of the join?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Strange plan in pg 8.1.0

2006-10-30 Thread Steinar H. Gunderson
On Mon, Oct 30, 2006 at 03:26:09PM +0100, Mattias Kregert wrote:
 On the left side it is text, and on the right side it is varchar(10).
 Casting left side to varchar(10) does not help, in fact it makes things 
 even worse: The cast to ::text vanishes in a puff of logic, but the plan 
 gets bigger and even slower (20-25 seconds).

Casting definitely won't help it any; it was more a question of having the
types in the _tables_ be the same.

Anyhow, this might be a red herring; others might have something more
intelligent to say in this matter.

By the way, does it use an index scan if you turn off sequential scans
(set enable_seqscan = false)?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Simple join optimized badly?

2006-10-10 Thread Steinar H. Gunderson
On Tue, Oct 10, 2006 at 09:07:03AM -0500, Jim C. Nasby wrote:
 Would you put something from the obfuscated C contest
 into production with comments describing what it does,

If nothing else, it would be a nice practical joke =)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] odd variances in count(*) times

2006-10-09 Thread Steinar H. Gunderson
On Mon, Oct 09, 2006 at 02:41:07PM -0400, Merlin Moncure wrote:
 that was it. amd system now drop to .3 seconds, windows .6. (doing
 time foo  psql -c bar  file). thanks...

What you want is probably \timing in psql, by the way. :-)
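
That is (the output line is approximate):

  \timing
  SELECT count(*) FROM foo;
  -- psql then prints e.g. "Time: 312.456 ms" after each statement, measuring just the
  -- query itself rather than client startup the way time(1) does.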

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] RAID 0 not as fast as expected

2006-09-16 Thread Steinar H. Gunderson
On Sat, Sep 16, 2006 at 04:46:04PM -0700, Luke Lonergan wrote:
 Yes.  What's pretty large?  We've had to redefine large recently, now we're
 talking about systems with between 100TB and 1,000TB.

Do you actually have PostgreSQL databases in that size range?

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] sql-bench

2006-09-14 Thread Steinar H. Gunderson
On Fri, Sep 15, 2006 at 02:11:23AM +0200, Grega Bremec wrote:
 Just a point (I've taught some MySQL courses before, sorry 'bout that;
 if you're not, I am, sort of :)) - the crash-proof version of
 transactional tables in MySQL was supposed to be the Berkeley ones, but
 (oh, the irony) they're still beta.

They are being dropped in 5.1.12 (yes, across a minor revision). From
http://dev.mysql.com/doc/refman/5.1/en/news-5-1-12.html:

  Incompatible change: Support for the BerkeleyDB (BDB) engine has been
  dropped from this release. Any existing tables that are in BDB format will
  not be readable from within MySQL from 5.1.12 or newer. You should convert
  your tables to another storage engine before upgrading to 5.1.12.

/* Steinar */
-- 
Homepage: http://www.sesse.net/


