Re: [PERFORM] 7.3 vs 7.4 performance

2004-02-05 Thread Chris Trawick
On Thu, 2004-02-05 at 00:32, Christopher Browne wrote:
> > Things of note that might matter: the machine is a dual Opteron
> > 1.4GHz running Fedora Core 1 Test 1 for X86_64.  The 7.3.4 was from
> > the Fedora distro and the 7.4.1 was the PGDG package.  The database
> > is 3.5 Gigs with 10 millions rows and the machine had 1 Gig or ram.
> >
> > Oh... as a side note I'm happy to announce that the 2.6 Linux kernel
> > has more than DOUBLED the speed of all my Postgres queries over the
> > 2.4. =)
> 
> I did some heavy-transaction-oriented tests recently on somewhat
> heftier quad-Xeon hardware, and found little difference between 2.4
> and 2.6, and a small-but-quite-repeatable advantage with FreeBSD 4.9.
> Now, I'm quite sure my load was rather different from yours, but I
> find the claim of doubling of speed rather surprising.

I don't.  I got a similar boost out of 2.6 when dealing with extreme
concurrency.  Then again, I also got a similar boost out of 7.4.  The
two together tickled my bank account.  ;)

One question though...  It sounds like your 7.3 binaries are 64-bit and
your 7.4 binaries are 32-bit.  Have you tried grabbing the SRPM for 7.4
and recompiling it for X86_64?

chris

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] 7.3 vs 7.4 performance

2004-02-05 Thread Carlos Eduardo Smanioto
> I did some heavy-transaction-oriented tests recently on somewhat
> heftier quad-Xeon hardware, and found little difference between 2.4
> and 2.6, and a small-but-quite-repeatable advantage with FreeBSD 4.9.
> Now, I'm quite sure my load was rather different from yours, but I
> find the claim of doubling of speed rather surprising.
> --

What's the type of File System you used in the Linux? I am wanting to know
which is the operational system better for PostgreSQL: FreeBSD versus Linux
2.6.

Thanks.

[]'s
Carlos Eduardo Smanioto (Brazil)

- Original Message -
From: "Christopher Browne" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, February 05, 2004 3:32 AM
Subject: Re: [PERFORM] 7.3 vs 7.4 performance


> Oops! [EMAIL PROTECTED] (Orion Henry) was seen spray-painting on a
wall:
> > I've done some testing of 7.3.4 vs 7.4.1 and found 7.4.1 to be 20%-30%
> > slower than 7.3.4.  Is this common knowledge or am I just unlucky with
> > my query/data selection?
>
> That seems unusual; the opposite seems more typical in view of there
> being some substantial improvements to the query optimizer.
>
> Have you tried doing EXPLAIN ANALYZE on the queries on both sides?
> There would doubtless be interest in figuring out what is breaking
> down...
>
> > Things of note that might matter: the machine is a dual Opteron
> > 1.4GHz running Fedora Core 1 Test 1 for X86_64.  The 7.3.4 was from
> > the Fedora distro and the 7.4.1 was the PGDG package.  The database
> > is 3.5 Gigs with 10 millions rows and the machine had 1 Gig or ram.
> >
> > Oh... as a side note I'm happy to announce that the 2.6 Linux kernel
> > has more than DOUBLED the speed of all my Postgres queries over the
> > 2.4. =)
>
> I did some heavy-transaction-oriented tests recently on somewhat
> heftier quad-Xeon hardware, and found little difference between 2.4
> and 2.6, and a small-but-quite-repeatable advantage with FreeBSD 4.9.
> Now, I'm quite sure my load was rather different from yours, but I
> find the claim of doubling of speed rather surprising.
> --
> (format nil "[EMAIL PROTECTED]" "aa454" "freenet.carleton.ca")
> http://www.ntlug.org/~cbbrowne/spiritual.html
> Failure is not an option. It comes bundled with your Microsoft product.
>
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
>
>http://www.postgresql.org/docs/faqs/FAQ.html


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] Index Performance Help

2004-02-05 Thread Damien Dougan
Hi All,

I've been seeing very slow read performance on a database of 1 million
indexed subscribers, which I believe is nothing to do with the data
itself, but delays on processing the index. 

If I make a random jump into the index (say X), it can take about 50ms
to read the subscriber. If I then make a "close by" lookup (say X+10),
it takes only about 0.5ms to read the subscriber. Making another lookup
to a "far away" (say X+1000), it again takes about 50ms to read.

>From the analyze output, it looks like most of the work is being done in
the index scan of the subscriber table - reading the actual data from
the PublicView is quite fast.

Am I correct in my analysis? Is there anything I can do to improve the
performance of the index lookups? 

(The indexes in question are all created as B-TREE.)

I've tried increasing the index memory and making a number of queries
around the index range, but a stray of several hundred indexes from a
cached entry always results in a major lookup delay.

I've also increased the shared memory available to Postgres to 80MB
incase this is a paging of the index, but it hasn't seemed to have any
effect.




Sample analyze output for an initial query:

hydradb=# explain analyze select * from pvsubscriber where actorid =
'b3432-asdas-232-Subscriber793500';
 
QUERY PLAN




 Nested Loop Left Join  (cost=0.00..13.19 rows=1 width=100) (actual
time=49.688..49.699 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..10.16 rows=1 width=69) (actual
time=49.679..49.689 rows=1 loops=1)
 Join Filter: ("inner".mc_childactor_id = "outer".id)
 ->  Nested Loop  (cost=0.00..10.15 rows=1 width=69) (actual
time=49.669..49.677 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..7.12 rows=1 width=73)
(actual time=43.969..43.974 rows=1 loops=1)
 ->  Index Scan using mc_actor_key on mc_actor
(cost=0.00..4.08 rows=1 width=69) (actual time=39.497..39.499 rows=1
loops=1)
   Index Cond: ((actorid)::text =
'b3432-asdas-232-Subscriber793500'::text)
 ->  Index Scan using rel_actor_has_subscriber_idx1
on rel_actor_has_subscriber rel_sub  (cost=0.00..3.02 rows=1 width=8)
(actual time=4.458..4.460 rows=1 loops=1)
   Index Cond: ("outer".id =
rel_sub.mc_actor_id)
   ->  Index Scan using mc_subscriber_id_idx on
mc_subscriber sub  (cost=0.00..3.02 rows=1 width=4) (actual
time=5.689..5.691 rows=1 loops=1)
 Index Cond: (sub.id = "outer".mc_subscriber_id)
 ->  Seq Scan on rel_actor_has_actor rel_parent
(cost=0.00..0.00 rows=1 width=8) (actual time=0.002..0.002 rows=0
loops=1)
   ->  Index Scan using mc_actor_id_idx on mc_actor  (cost=0.00..3.02
rows=1 width=39) (actual time=0.002..0.002 rows=0 loops=1)
 Index Cond: ("outer".mc_parentactor_id = mc_actor.id)
 Total runtime: 49.845 ms
(15 rows)



And the analyze output for a "nearby" subscriber (10 indexes away):

hydradb=# explain analyze select * from pvsubscriber where actorid =
'b3432-asdas-232-Subscriber793510';
 
QUERY PLAN




 Nested Loop Left Join  (cost=0.00..13.19 rows=1 width=100) (actual
time=0.278..0.288 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..10.16 rows=1 width=69) (actual
time=0.271..0.280 rows=1 loops=1)
 Join Filter: ("inner".mc_childactor_id = "outer".id)
 ->  Nested Loop  (cost=0.00..10.15 rows=1 width=69) (actual
time=0.264..0.272 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..7.12 rows=1 width=73)
(actual time=0.246..0.251 rows=1 loops=1)
 ->  Index Scan using mc_actor_key on mc_actor
(cost=0.00..4.08 rows=1 width=69) (actual time=0.220..0.221 rows=1
loops=1)
   Index Cond: ((actorid)::text =
'b3432-asdas-232-Subscriber793510'::text)
 ->  Index Scan using rel_actor_has_subscriber_idx1
on rel_actor_has_subscriber rel_sub  (cost=0.00..3.02 rows=1 width=8)
(actual time=0.017..0.018 rows=1 loops=1)
   Index Cond: ("outer".id =
rel_sub.mc_actor_id)
   ->  Index Scan using mc_subscriber_id_idx on
mc_subscriber sub  (cost=0.00..3.02 rows=1 width=4) (actual
time=0.012..0.013 rows=1 loops=1)
 Index Cond: (sub.id = "outer".mc_subscriber_id)
 ->  Seq Scan on rel_actor_has_actor rel_parent
(cost=0.00..0.00 rows=1 width=8) (actual time=0.002..0.002 rows=0
loops=1)
   ->  Index Scan using mc_actor_id_idx on mc_actor  (cost=0.00..3.02
rows=1 width=39) (actual time=0.001..0.001 rows=0 loops=1)
 Index Cond: ("outer".mc_parentactor_id = mc_actor.id)
 Total runt

Re: [PERFORM] 7.3 vs 7.4 performance

2004-02-05 Thread Christopher Browne
In an attempt to throw the authorities off his trail, [EMAIL PROTECTED] ("Carlos 
Eduardo Smanioto") transmitted:
>> I did some heavy-transaction-oriented tests recently on somewhat
>> heftier quad-Xeon hardware, and found little difference between 2.4
>> and 2.6, and a small-but-quite-repeatable advantage with FreeBSD
>> 4.9.  Now, I'm quite sure my load was rather different from yours,
>> but I find the claim of doubling of speed rather surprising.
>
> What's the type of File System you used in the Linux? I am wanting
> to know which is the operational system better for PostgreSQL:
> FreeBSD versus Linux 2.6.

On the Linux box in question, I was using JFS, which has had the mixed
reviews, lately, that on the one hand, it _appears_ to be a tad faster
than all the others, but that has been, on the other hand, associated
with systems hanging up and crashing, under load.

The latter bit is a _really_ big caveat.  On that particular machine,
I have a nicely repeatable "test case" where I can do a particular set
of "system load" that consistently takes the system down, to the point
of having to hit the "big red button."  If I could point to a clear
reason why it happens, I'd be a much happier camper.  As it stands, it
is a bit nebulous whether the problem is:
 a) Hardware drivers,
 b) Flakey hardware (which Linux 2.6.1 copes with a lot better than
2.4!),
 c) Flakey 2.4 kernel,
 d) Problem with JFS,
 e) Something else not yet identified as a plausible cause.

If I could say, "Oh, it's an identified bug in the Frobozz RAID
controller drivers, and was fixed in 2.6.0-pre-17", that would help
allay the suspicion that the problem could be any of the above.
-- 
let name="aa454" and tld="freenet.carleton.ca" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/
"Another result of the tyranny of Pascal is that beginners don't use
function pointers."  --Rob Pike

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] 7.3 vs 7.4 performance

2004-02-05 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> I'm wondering if we need specific compile-time switches for Opteron.   I know
> we got Opteron code tweaks in the last version,

Not in 7.4.  There is some marginal hacking in the spinlock code in CVS
tip for multi-CPU i386 and x86_64 (viz, add a PAUSE instruction inside
the wait loop) but I'm not sure that will have any significance in real
life.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Index Performance Help

2004-02-05 Thread Richard Huxton
On Thursday 05 February 2004 12:13, Damien Dougan wrote:
> Hi All,
>
> I've been seeing very slow read performance on a database of 1 million
> indexed subscribers, which I believe is nothing to do with the data
> itself, but delays on processing the index.
>
> If I make a random jump into the index (say X), it can take about 50ms
> to read the subscriber. If I then make a "close by" lookup (say X+10),
> it takes only about 0.5ms to read the subscriber. Making another lookup
> to a "far away" (say X+1000), it again takes about 50ms to read.

The first time, it has to fetch a block from disk. The second time that disk 
block is already in RAM so it's much faster. The third time it needs a 
different disk block.

> Am I correct in my analysis? Is there anything I can do to improve the
> performance of the index lookups?

Make sure you have enough RAM to buffer your disks. Buy faster disks.

> I've tried increasing the index memory and making a number of queries
> around the index range, but a stray of several hundred indexes from a
> cached entry always results in a major lookup delay.

Yep, that'll be your disks.

> I've also increased the shared memory available to Postgres to 80MB
> incase this is a paging of the index, but it hasn't seemed to have any
> effect.

Probably the wrong thing to do (although you don't mention what hardware 
you've got). Read the tuning document at:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

> Sample analyze output for an initial query:
>
> hydradb=# explain analyze select * from pvsubscriber where actorid =
> 'b3432-asdas-232-Subscriber793500';
...
>  ->  Index Scan using mc_actor_key on mc_actor
> (cost=0.00..4.08 rows=1 width=69) (actual time=39.497..39.499 rows=1
> loops=1)
...
>  Total runtime: 49.845 ms

> And the analyze output for a "nearby" subscriber (10 indexes away):
>
> hydradb=# explain analyze select * from pvsubscriber where actorid =
> 'b3432-asdas-232-Subscriber793510';
>
...
>  ->  Index Scan using mc_actor_key on mc_actor
> (cost=0.00..4.08 rows=1 width=69) (actual time=0.220..0.221 rows=1
> loops=1)
>  Total runtime: 0.428 ms
> (15 rows)

That certainly seems to be the big change - the only way to consistently get 
1ms timings is going to be to make sure all your data is cached. Try the 
tuning guide above and see what difference that makes. If that's no good, 
post again with details of your config settings, hardware, number of clients 
etc...

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Index Performance Help

2004-02-05 Thread Damien Dougan
Thanks Richard.

It certainly does appear to be memory related (on a smaller data set of
250K subscribers, all accesses are < 1ms).


We're going to play with increasing RAM on the machine, and applying the
optimisation levels on the page you recommended.

(We're also running on a hardware RAID controlled SCSI set - mirrored
disks so reading should be very fast).


Cheers,

Damien



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] select is not using index?

2004-02-05 Thread Mark Harrison
Corey Edwards wrote:

Your column is a bigint but 123 defaults to type int. Indexes aren't
used when there's a type mismatch. Use an explicit cast or quote it:
  select * from bigtable where id = 123::bigint;

Or

  select * from bigtable where id = '123';
Thanks Corey, both of these do exactly what I need...

Cheers,
Mark
--
Mark Harrison
Pixar Animation Studios
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] 7.3 vs 7.4 performance

2004-02-05 Thread Orion Henry

> 
> One question though...  It sounds like your 7.3 binaries are 64-bit and
> your 7.4 binaries are 32-bit.  Have you tried grabbing the SRPM for 7.4
> and recompiling it for X86_64?

No, they were all 64 bit.

I'm going to run explains on all my queries and see if I can find 
anything of interest...



signature.asc
Description: This is a digitally signed message part


Re: [PERFORM] Index Performance Help

2004-02-05 Thread Josh Berkus
Damian,

Also, if there have been a lot of updates to the table, you may need to run a 
REINDEX on it.   An attenuated index would be slow to load because of the 
nummber of empty disk blocks. 

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?

2004-02-05 Thread Mark Harrison
Jan Wieck wrote:
It might not work with the words I used above, but the point I tried to 
make is that the hardest thing you can "sell" is a "no". I mean, not 
just saying "no", but selling it in a way that the customer will not go 
with the next idiot who claims "we can do that".
But you will need some kind of data or reasoning to back up your response,
especially if it is deviating from the conventional wisdom, or from
some familiar system.
Especially in this case, it's not a "no" answer that's being sold...
it's "solution a is better than solution b, even though you might
be more familiar with solution b."
Cheers,
Mark
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] COPY with INDEXES question

2004-02-05 Thread Slavisa Garic
Hi,

I have a quick question. In order to speed up insertion of large number of
rows (100s of thousands) I replaced the INSERT with the COPY. This works
fine but one question popped into my mind. Does copy updates indexes on
that table if there are some defined?

Thanks,
Slavisa



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] COPY with INDEXES question

2004-02-05 Thread Christopher Kings-Lynne
I have a quick question. In order to speed up insertion of large number of
rows (100s of thousands) I replaced the INSERT with the COPY. This works
fine but one question popped into my mind. Does copy updates indexes on
that table if there are some defined?
Yes, of course.  Runs triggers and stuff as well.

Chris

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] COPY with INDEXES question

2004-02-05 Thread Slavisa Garic
Thanks for the reply and thanks even more for the good one :).

Cheers,
Slavisa

On Fri, 6 Feb 2004, Christopher Kings-Lynne wrote:

> > I have a quick question. In order to speed up insertion of large number of
> > rows (100s of thousands) I replaced the INSERT with the COPY. This works
> > fine but one question popped into my mind. Does copy updates indexes on
> > that table if there are some defined?
> 
> Yes, of course.  Runs triggers and stuff as well.
> 
> Chris
> 
> 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings