[PERFORM] seq scan cache vs. index cache smackdown

2005-02-14 Thread Mark Aufflick
Hi All,
I have boiled my situation down to the following simple case (Postgres 
version 7.3):

* Query 1 is doing a sequential scan over a table (courtesy of field 
ILIKE 'foo%') and index joins to a few others
* Query 2 is doing a functional index scan over the same table 
(lower(field) LIKE 'foo%') and index joins to a few others
* neither query has an order by clause
* for the purpose of testing, both queries are designed to return the 
same result set
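For concreteness, the two query shapes look roughly like this (the table and column names are placeholders for illustration, not the real schema; whether the LIKE 'foo%' prefix can use an index also depends on the database locale):

```sql
-- Q1: ILIKE cannot use a plain btree index, so this forces a
-- sequential scan over the whole table:
SELECT * FROM things WHERE field ILIKE 'foo%';

-- Functional index assumed by Q2 (the query expression must
-- match the indexed expression exactly):
CREATE INDEX things_field_lower_idx ON things (lower(field));

-- Q2: the planner can satisfy this prefix match from the
-- functional index:
SELECT * FROM things WHERE lower(field) LIKE 'foo%';
```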

Obviously Q2 is faster than Q1, but if I ever run them both at the same 
time (let's say I run two of Q1 and one of Q2 at the same time) then Q2 
consistently returns WORSE times than Q1 (EXPLAIN ANALYZE confirms that 
it is using the index).

My assumption is that the sequential scan is evicting the index from any 
cache it might live in, and simultaneously stealing all the disk I/O 
that is needed to access the index on disk (the table has 200,000 
rows).

If I simplify the case to skip the index joins (i.e. operate on the 
one table only) the situation is not as dramatic, but similar.

My thoughts are:
1) kill the sequential scan - but unfortunately I don't have direct 
control over that code
2) change the way the server allocates/prioritizes different caches - I 
don't know enough about how Postgres caches work to do this (if it's 
possible)
3) try it on postgres 7.4 - possible, but migrating the system to 7.4 
in production will be hard because the above code that I am not 
responsible for has a lot of (slightly wacky) implicit date casts
4) ask the fine people on the mailing list for other suggestions!
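On option 2: Postgres doesn't let you pin an index in its buffer cache, but the size of that cache is tunable. A hedged sketch for a 7.3-era postgresql.conf (the values below are illustrative only, not recommendations; in this era the settings are counted in 8 kB pages, with no unit suffixes):

```
# postgresql.conf (PostgreSQL 7.3) -- illustrative values only
shared_buffers = 8192          # 8192 x 8kB pages = 64MB of buffer cache
effective_cache_size = 65536   # planner's estimate of OS cache, in 8kB pages
random_page_cost = 3           # lower values make index scans look cheaper
```

A larger shared_buffers gives the index more room to survive a concurrent sequential scan, though the OS page cache is also in play and the scan will churn that too.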
--
Mark Aufflick
  e  [EMAIL PROTECTED]
  w  www.pumptheory.com (work)
  w  mark.aufflick.com (personal)
  p  +61 438 700 647
  f  +61 2 9436 4737


iBurst Wireless Broadband from $34.95/month   www.platformnetworks.net
Forward undetected SPAM to:   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Performance over a LAN

2004-07-23 Thread Mark Aufflick
I don't think that's the advice being looked for here - if this  
behaviour is repeatable, then there is something askew with the inet  
protocol.

What is the client application written in? Do you know what version of  
the postgres network protocol your driver code is using? Are the  
inserts inside a transaction?

I'm very interested in this issue since the environment I now work in  
has a lot of network connected databases and the performance is much  
less than I am used to with local databases.

Mark.
--  
Mark Aufflick
  e  [EMAIL PROTECTED]
  w  www.pumptheory.com (work)
  w  mark.aufflick.com (personal)
  p  +61 438 700 647
On 23/07/2004, at 4:02 PM, Christopher Kings-Lynne wrote:

> But with the server running on one machine and the client running on
> another, the two machines being connected by a 100 Mb ethernet, with
> nothing else on the network, this test takes 17 minutes to run. I have
> tried changing the frequency of COMMIT operations, but with only a
> small effect.

Are you using separate INSERT statements?  Try using COPY instead, 
it's much faster.

chris
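Chris's COPY suggestion can be sketched as follows (table and column names are made up for illustration):

```sql
-- Row-at-a-time INSERTs pay parse/plan/network round-trip cost
-- per statement:
INSERT INTO measurements (id, val) VALUES (1, 0.5);
INSERT INTO measurements (id, val) VALUES (2, 0.7);

-- COPY streams all rows in one protocol exchange, which is much
-- faster for bulk loads (tab-separated input, terminated by \.):
COPY measurements (id, val) FROM STDIN;
1	0.5
2	0.7
\.
```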
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] hardware raid suggestions

2004-07-15 Thread Mark Aufflick
Not sure what your hw platform is, but I always used to get fantastic  
performance from Compaq Smart Array battery backed cards. Note that I  
haven't bought any recently so HP may have hp invent-ed them...

But whatever the brand - if you get a swag of battery backed cache you  
won't know yourself. It's fun to install an OS on them as well - watch  
the drive format and verify take 10 seconds ;)

Another option to look at is outboard raid boxes that present a single  
drive interface to the server - I know people  who swear by them.
--  
Mark Aufflick
  e  [EMAIL PROTECTED]
  w  www.pumptheory.com (work)
  w  mark.aufflick.com (personal)
  p  +61 438 700 647
On 16/07/2004, at 4:07 AM, Brian Hirt wrote:

I've been using the Adaptec ZCR raid cards in our servers for a while 
now, mostly small systems with 3 or 6 disks, and we've been very happy 
with them. However, we're building a new DB machine with 14 U320 15K 
SCA drives, and we've run into a performance bottleneck with the ZCR 
card where it just won't scale well. Without going into too many 
details, we've tested RAID5, RAID10 and RAID50 on pretty much every 
array size from 4-14 disks (RAID50 tests used more drives), using 
JFS, reiserfs and EXT3. With every different configuration, 
performance didn't improve after array size became greater than 6 
disks. We used various benchmarks, including pgbench with scale 
factors of 10, 100, 1000, 5000 and clients of 10, 15, 30 and 45. 
We've run many other tests and monitoring tools, and we've come to 
the conclusion that the ZCR is the problem.

We're looking into getting an Adaptec 2200S or the Megaraid 320 2x  
which have better processors, and hopefully better performance.  We  
feel that the use of the AIC7930 as the CPU on the ZCR  just doesn't  
cut it and a faster raid controller would work better. Does anyone out  
there have any experience with these cards with postgresql and linux?   
If so, would you be willing to share your experiences and possibly  
give a recommendation?

--brian
---(end of broadcast)---
TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] PREPAREing statements versus compiling PLPGSQL

2004-06-07 Thread Mark Aufflick
Hi all,
I am optimizing some code that does a lot of iterative selects and 
inserts within loops. Because of the exception handling limitations in 
postgres and with no ability to twiddle autocommit, just about every 
operation is standalone.

Over 5,000-odd lines this gets very slow (5-10 minutes including processing).
In seeking to speed it up I am PREPAREing the most common inserts and 
selects. I have a few operations already inside plpgsql functions. 
EXECUTE means something different within a plpgsql function, so I am 
wondering if there is a way to execute a pre-PREPAREd query inside a 
function.

Or is this even necessary - are queries within plpgsql functions 
automatically prepared when the function is first compiled? On a similar 
note, is there any benefit in PREPAREing a select from a plpgsql function?
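For the session-level half of this, the PREPARE/EXECUTE pair looks like the following (statement and table names are made up for illustration; a prepared statement lives only for the life of the session):

```sql
-- Prepare once per session; the statement is parsed and planned
-- a single time:
PREPARE ins_item (integer, text) AS
    INSERT INTO items (id, name) VALUES ($1, $2);

-- Each execution reuses the stored plan:
EXECUTE ins_item (1, 'first');
EXECUTE ins_item (2, 'second');

-- Caveat: inside plpgsql, EXECUTE runs a dynamically built SQL
-- string instead, so EXECUTE ins_item(...) in a function body
-- will not reach a session-level prepared statement.
```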

Or does anyone have any smart ways to turn off autocommit? (I have 
already played with commit_delay and commit_siblings).
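On the autocommit question: there is no server-side autocommit switch to flip here, but wrapping a batch of otherwise-standalone statements in one explicit transaction has the same effect, since only the final COMMIT must wait for a WAL flush. A sketch (table name is hypothetical):

```sql
BEGIN;
INSERT INTO items (id, name) VALUES (1, 'a');
INSERT INTO items (id, name) VALUES (2, 'b');
-- ... thousands more statements ...
COMMIT;  -- one synchronous WAL flush for the whole batch
```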

My empirical testing has proven inconclusive (other than turning off 
fsync, which makes a huge difference but is not possible on the live 
system, or using a fat Compaq raid card).

Thanks for any help,
Mark.
--
Mark Aufflick
e: [EMAIL PROTECTED]
w: www.pumptheory.com (business)
w: mark.aufflick.com  (personal)
p: +61 438 700 647
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])