[PERFORM] raid setup for db

2009-05-13 Thread Thomas Finneid

Hi

I am wondering what stripe size on a RAID 0 array is most suitable for 
Postgres 8.2?


I read a performance tutorial by Bruce Momjian and it suggests setting 
the stripe size to match the block size (the one PG uses?)

( http://momjian.us/main/writings/pgsql/hw_performance/index.html )
But I want to check whether I have understood this correctly.
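
For reference, the block size a PG build uses can be checked directly - 
8192 bytes (8 kB) is the standard default:

   SHOW block_size;  -- reports the database page size in bytes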

Are there any other hot configuration tips I should consider, or does 
anybody have suggestions for other RAID configuration articles?


regards

thomas



Re: [PERFORM] increase index performance

2009-05-13 Thread Thomas Finneid


First off, is there a way to pre-filter some of this data, via a view, 
temporary table, partitioned indexes or something?


Secondly, one of the problems seems to be the size of the data and its 
index. How can I calculate how much space a particular part of the index 
needs in memory? Maybe I could rearrange things a bit so they fit better 
inside pages, and so on.


Thirdly, I was a bit unclear, and this was the best example I could think 
of (my client probably doesn't want me to talk about this at all... hence 
the contrived example):


   85 city_ids,
   2000 street_ids per city,
   10 house_ids per street,
   500 floor_ids per house

Now it should have the correct data distribution and the correct 
cardinality.


In this particular query I am interested in all streets in a city that 
have the specific house id and the specific floor id.


By specifying
city_id, house_id and floor_id

I should get all street_ids that match.

The example you gave, Greg, assumed I wanted to follow the cardinality, 
but I need to skip the second field in order to get the right query. So 
pulling the data based on the first two fields, city and street, would 
just give me data for a single street, when I want it for all streets.


Greg Smith wrote:

On Tue, 12 May 2009, Thomas Finneid wrote:

on a database with 260 GB of data and an index size of 109 GB on 
separate RAID disks. There are

85 city_ids,
2000 street_ids per city,
20 house_ids per street per city,
5 floor_ids per house per street per city


You should test what happens if you reduce the index to just being 
(city_id,street_id).  Having all the fields in there makes the index 
larger, and it may end up being faster to just pull all of the ~100 data 
rows for a particular (city_id,street_id) using the smaller index and 
then filter out just the ones you need.  Having a smaller index to 
traverse also means that you'll be more likely to keep all the index 
blocks in the buffer cache moving forward.


A second level improvement there is to then CLUSTER on the smaller 
index, which increases the odds you'll get all of the rows you need by 
fetching only a small number of data pages.
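
In SQL terms, that suggestion amounts to something like the following 
sketch (table and index names here are made up for illustration; the 
CLUSTER ... USING form is the 8.3+ syntax):

   -- the smaller, two-column index
   CREATE INDEX addr_city_street_idx ON addresses (city_id, street_id);

   -- optional second step: physically reorder the table by that index
   CLUSTER addresses USING addr_city_street_idx;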


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD





Re: [PERFORM] AMD Shanghai versus Intel Nehalem

2009-05-13 Thread Scott Marlowe
Just realized I made a mistake. I was under the impression that
Shanghai CPUs had 8xxx numbers while Barcelonas had 23xx numbers.  I
was wrong; it appears the 8xxx numbers are for 4+ socket servers while
the 23xx numbers are for 2 or fewer sockets.  So there are several
quite affordable Shanghai CPUs out there, and many of the ones I
quoted as Barcelonas are in fact Shanghais with the larger 6M L3
cache.



Re: [PERFORM] AMD Shanghai versus Intel Nehalem

2009-05-13 Thread Arjen van der Meijden
We have a dual E5540 with 16GB (I think 1066MHz) memory here, but no AMD 
Shanghai. We haven't done PostgreSQL benchmarks yet, but given our 
previous experience, PostgreSQL should be equally faster compared to MySQL.


Our database benchmark is actually mostly a CPU/memory benchmark. 
Comparing the results of the dual E5540 (2.53GHz with HT enabled) to a 
dual Intel X5355 (a 2.6GHz quad core from 2007), the peak load has 
increased from somewhere between 7 and 10 concurrent clients to 
somewhere around 25, suggesting more scalable hardware. With 25 
concurrent clients we handled 2.5 times as many queries/second as 
the 7-concurrent-client score for the X5355, both in MySQL 
5.0.41. At 7 concurrent clients we still had 1.7 times the previous result.


I'm not really sure how the Shanghai CPUs compare to those older 
X5355s; the AMDs should be faster, but by how much?


I've no idea if we'll get a Shanghai to compare it with, but we will get a 
dual X5570 soon on which we'll repeat some of the tests, so that should 
at least help a bit in scaling down the X5570 results seen around the world.


Best regards,

Arjen

On 12-5-2009 20:47 Scott Marlowe wrote:

Anyone on the list had a chance to benchmark the Nehalems yet?  I'm
primarily wondering if their promise of performance from 3 memory
channels holds up under typical pgsql workloads.  I've been really
happy with the behavior of my AMD Shanghai-based server under heavy
loads, but if the Nehalems' much-touted performance increase translates
to pgsql, I'd like to know.





Re: [PERFORM] raid setup for db

2009-05-13 Thread Rafael Martinez
Thomas Finneid wrote:
 Hi
 
 I am wondering what stripe size on a RAID 0 array is most suitable for
 Postgres 8.2?
 

Hello

RAID 0 for a database? This is a disaster waiting to happen.
Are you sure you want to use RAID 0?

regards
-- 
 Rafael Martinez, r.m.guerr...@usit.uio.no
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/



Re: [PERFORM] raid setup for db

2009-05-13 Thread Thomas Finneid

Don't worry about it; this is just for performance testing.

thomas

 Thomas Finneid wrote:
 Hi

 I am wondering what stripe size on a RAID 0 array is most suitable for
 Postgres 8.2?


 Hello

 RAID 0 for a database? This is a disaster waiting to happen.
 Are you sure you want to use RAID 0?

 regards
 --
  Rafael Martinez, r.m.guerr...@usit.uio.no
  Center for Information Technology Services
  University of Oslo, Norway

  PGP Public Key: http://folk.uio.no/rafael/







Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
On 5/12/09, Robert Haas robertmh...@gmail.com wrote:
 On Tue, May 12, 2009 at 1:00 PM, Dimitri dimitrik...@gmail.com wrote:
 On MySQL there are no changes if I set the number of sessions in the
 config file to 400 or to 2000 - for 2000 it'll just allocate more
 memory.

 I don't care whether the setting affects the speed of MySQL.  I want
 to know if it affects the speed of PostgreSQL.

The problem is they both have a max_connections parameter, so as you
asked about MySQL I answered for MySQL; I did not test PostgreSQL yet,
that will be in the next series..


 After the latest fix, with default_statistics_target=5, version 8.3.7 is
 running as fast as 8.4; 8.4 is even a little bit slower.

 I understand your position with a pooler, but I also want you to think
 about the idea that 128-core systems will become commodity servers very
 soon, and to use these cores at their full power you'll need a
 database engine capable of running 256 users without a pooler, because a
 pooler will not help you here anymore..

 So what?  People with 128-core systems will not be running trivial
 joins that return in 1-2ms and have one second think times between
 them.  And if they are, and if they have nothing better to do than
 worry about whether MySQL can process those queries in 1/2000th of the
 think time rather than 1/1000th of the think time, then they can use
 MySQL.  If we're going to worry about performance on 128-core systems,
 we would be much better advised to put our efforts into parallel query
 execution than how many microseconds it takes to execute very simple
 queries.

Do you really think that nowadays, for example, a web forum application
with PG as a backend will have queries running slower than 1-2 ms to print
a thread message in your browser???  Or banking transactions??


 Still, I have no problem with making PostgreSQL faster in the case
 you're describing.  I'm just not interested in doing it on my own time
 for free.  I am sure there are a number of people who read this list
 regularly who would be willing to do it for money, though.  Maybe even
 me.  :-)

 ...Robert


You don't need to believe me, but I'm doing it for free - I still have
my work to finish in parallel :-))  And at the same time I don't see
any other way to learn and improve my knowledge, but nobody is perfect
:-))

Rgds,
-Dimitri



Re: [PERFORM] Timestamp index not used in some cases

2009-05-13 Thread Евгений Василев
On Tuesday 12 May 2009 12:55:14 Scott Marlowe wrote:
 On Tue, May 12, 2009 at 3:00 AM, Евгений Василев
 evasi...@jarcomputers.com wrote:
  I have the following table:
 
  CREATE TABLE temp.tmp_135528
  (
  id integer NOT NULL,
  prid integer,
  group_id integer,
  iinv integer,
  oinv integer,
  isum numeric,
  osum numeric,
  idate timestamp without time zone,
  odate timestamp without time zone,
  CONSTRAINT t_135528_pk PRIMARY KEY (id)
  )
  WITH (OIDS=FALSE);
 
  With index:
 
  CREATE INDEX t_135528
  ON temp.tmp_135528
  USING btree
  (idate, group_id, osum, oinv);
 
  When the following query is executed the index is not used:
 
  EXPLAIN SELECT id, osum
  FROM temp.tmp_135528
  WHERE idate >= '2007-05-17 00:00:00'::timestamp
  AND group_id = '13'
  AND osum = '19654.45328'
  AND oinv = -1
 
  QUERY PLAN
  ----------------------------------------------------------------------------
   Seq Scan on tmp_135528  (cost=0.00..7022.36 rows=1166 width=11)
     Filter: ((idate >= '2007-05-17 00:00:00'::timestamp without time zone)
     AND (osum = 19654.45328) AND (group_id = 13) AND (oinv = (-1)))
  (2 rows)
 
  When
  idate >= '2007-05-17 00:00:00'::timestamp
  is changed to
  idate = '2007-05-17 00:00:00'::timestamp
  or
  idate <= '2007-05-17 00:00:00'::timestamp
  then the index is used:
 
  EXPLAIN SELECT id, osum
  FROM temp.tmp_135528
  WHERE idate = '2007-05-17 00:00:00'::timestamp
  AND group_id = '13'
  AND osum = '19654.45328'
  AND oinv = -1;
  QUERY PLAN
  ----------------------------------------------------------------------------
   Index Scan using t_135528 on tmp_135528  (cost=0.00..462.61 rows=47 width=11)
     Index Cond: ((idate = '2007-05-17 00:00:00'::timestamp without time zone)
     AND (group_id = 13) AND (osum = 19654.45328) AND (oinv = (-1)))
  (2 rows)
 
  Why can I not use the index with a >= comparison on a timestamp?

 You can.  But in this instance one query is returning 47 rows while
 the other is returning 1166 rows (or the planner thinks it is).
 There's a switchover point where it's cheaper to seq scan.  You can
 adjust this point up and down by adjusting various cost parameters.
 random_page_cost is commonly lowered to the 1.5 to 2.0 range, and
 effective_cache_size is normally set higher, to match the cache in the
 kernel plus the shared_buffers size.
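
In SQL terms, that tuning can be tried out per session before touching
postgresql.conf (a sketch; the values are just the starting points
mentioned above, and the cache size is a made-up example):

   SET random_page_cost = 1.5;        -- default is 4.0
   SET effective_cache_size = '6GB';  -- roughly kernel cache + shared_buffers

   EXPLAIN SELECT id, osum
   FROM temp.tmp_135528
   WHERE idate >= '2007-05-17 00:00:00'::timestamp
   AND group_id = '13'
   AND osum = '19654.45328'
   AND oinv = -1;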

Thank you, this worked like a charm.



Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
Hi Scott,

On 5/12/09, Scott Carey sc...@richrelevance.com wrote:
 Although nobody wants to support it, he should try the patch that Jignesh K.
 Shah (from Sun) proposed that makes ProcArrayLock lighter-weight.  If it
 makes 32 cores much faster, then we have a smoking gun.

 Although everyone here is talking about this as an 'unoptimal' solution, the
 fact is there is no evidence that a connection pooler will fix the
 scalability from 16 to 32 cores.
 Certainly a connection pooler will help most results, but it may not fix the
 scalability problem.

 A question for Dimitri:
 What is the scalability from 16 to 32 cores at the 'peak' load that occurs
 near 2x the CPU count?  Is it also poor?  If this is also poor, IMO the
 community here should not be complaining about this unopimal case -- a
 connection pooler at that stage does little and prepared statements will
 increase throughput but not likely alter scalability.

I'm attaching a small graph showing the TPS level on PG 8.4 depending on
the number of cores (the X-axis is the number of concurrent users, the
Y-axis is the TPS number). As you may see, the TPS increase is nearly
linear while moving from 8 to 16 cores, while on 32 cores it grows slightly
differently; what is unclear is why the TPS level stays limited to
11,000 TPS on 32 cores. And it's a pure read-only workload.


 If that result scales, then the short term answer is a connection pooler.

 In the tests that Jignesh ran -- making the ProcArrayLock faster helped the
 case where connections = 2x the CPU core count quite a bit.

 The thread about the CPU scalability is "Proposal of tunable fix for
 scalability of 8.4", originally posted by Jignesh K. Shah
 j.k.s...@sun.com, March 11 2009.

 It would be very useful to see results of this benchmark with:
 1. A Connection Pooler

will not help, as each client is *not* disconnecting/reconnecting
during the test; as well, PG is holding up well even with 256 users. And the
TPS limit is already reached at 64 users, so I don't think a pooler will
help here.

 2. Jignesh's patch

I've already tested it and it did not help in my case because the real
problem is elsewhere.. (however, I did not test it yet with my latest
config params)

 3. Prepared statements


yes, I'm preparing this test.

 #3 is important, because prepared statements are ideal for queries that
 perform well with low statistics_targets, and not ideal for those that
 require high statistics targets.  Realistically, an app won't have more than
 a couple dozen statement forms to prepare.  Setting the default statistics
 target to 5 is just a way to make some other query perform badly.

Agreed, but as you may set a different statistics target *per* column, it
should not be a problem. What is sure is that all the time spent on parsing
and planning will be removed here, and the final time should be pure
execution.
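
For reference, the prepared-statement path looks like this in SQL (a
sketch reusing the contrived city/house/floor schema from the other
thread; the table name and values are made up):

   -- parsed and planned once per connection
   PREPARE streets_q(int, int, int) AS
   SELECT street_id FROM addresses
   WHERE city_id = $1 AND house_id = $2 AND floor_id = $3;

   -- each later call skips the parser and planner entirely
   EXECUTE streets_q(42, 7, 3);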

Rgds,
-Dimitri



 On 5/12/09 10:53 AM, Alvaro Herrera alvhe...@commandprompt.com wrote:

 Andres Freund wrote:

 Naturally it would still be nice to be good in this not optimal
 workload...

 I find it hard to justify wasting our scarce development resources into
 optimizing such a contrived workload.

 --
 Alvaro Herrera
 http://www.CommandPrompt.com/
 The PostgreSQL Company - Command Prompt, Inc.




attachment: Hist_coresALL_RW0.ccrnone.pgsql-8.4beta1-buf4096-tps_avg-1.gif


Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
I'm also confused, but it seems the discussion is giving other ideas too :-)
But yes, each client is connecting to the database server only *once*.

To be precise, this is how the test runs:
 - 1 client is started           => 1 in total
 - sleep ...
 - 1 other client is started     => 2 in total
 - sleep ..
 - 2 other clients are started   => 4 in total
 - sleep ..
 ...
 ...                             => 256 in total
 - sleep ...
 - kill clients

So I'm even able to monitor how each new client impacts all the others. The
test kit is quite flexible and can prepare any kind of stress situation.

Rgds,
-Dimitri

On 5/12/09, Glenn Maynard glennfmayn...@gmail.com wrote:
 I'm sorry, but I'm confused.  Everyone keeps talking about connection
 pooling, but Dimitri has said repeatedly that each client makes a
 single connection and then keeps it open until the end of the test,
 not that it makes a single connection per SQL query.  Connection
 startup costs shouldn't be an issue.  Am I missing something here?
 test(N) starts N clients, each client creates a single connection and
 hammers the server for a while on that connection.  test(N) is run for
 N=1,2,4,8...256.  This seems like a very reasonable test scenario.

 --
 Glenn Maynard





Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
On MySQL there are no changes if I set the number of sessions in the
config file to 400 or to 2000 - for 2000 it'll just allocate more
memory.

After the latest fix, with default_statistics_target=5, version 8.3.7 is
running as fast as 8.4; 8.4 is even a little bit slower.

I understand your position with a pooler, but I also want you to think
about the idea that 128-core systems will become commodity servers very
soon, and to use these cores at their full power you'll need a
database engine capable of running 256 users without a pooler, because a
pooler will not help you here anymore..

Rgds,
-Dimitri

On 5/12/09, Robert Haas robertmh...@gmail.com wrote:
 On Tue, May 12, 2009 at 11:22 AM, Dimitri dimitrik...@gmail.com wrote:
 Robert, what I'm testing now is 256 users max. The workload is growing
 progressively from 1, 2, 4, 8 ... to 256 users. Of course the Max
 throughput is reached on the number of users equal to 2 * number of
 cores, but what's important for me here - database should continue to
 keep the workload! - response time regressing, but the throughput
 should remain near the same.

 So, do I really need a pooler to keep 256 users working??  - I don't
 think so, but please, correct me.

 Not an expert on this, but there has been a lot of discussion of the
 importance of connection pooling in this space.  Is MySQL still faster
 if you lower max_connections to a value that is closer to the number
 of users, like 400 rather than 2000?

 BTW, I did not look to put PostgreSQL in bad conditions - the test is
 the test, and as I said 2 years ago PostgreSQL outperformed MySQL on
 the same test case, and there was nothing done within MySQL code to
 improve it explicitly for db_STRESS.. And I'm staying pretty honest
 when I'm testing something.

 Yeah but it's not really clear what that something is.  I believe you
 said upthread that PG 8.4 beta 1 is faster than PG 8.3.7, but PG 8.4
 beta 1 is slower than MySQL 5.4 whereas PG 8.3.7 was faster than some
 older version of MySQL.  So PG got faster and MySQL got faster, but
 they sped things up more than we did.  If our performance were getting
 WORSE, I'd be worried about that, but the fact that they were able to
 make more improvement on this particular case than we were doesn't
 excite me very much.  Sure, I'd love it if PG were even faster than it
 is, and if you have a suggested patch please send it in...  or if you
 want to profile it and send the results that would be great too.  But
 I guess my point is that the case of a very large number of
 simultaneous users with pauses-for-thought between queries has already
 been looked at in the very recent past in a way that's very similar to
 what you are doing (and by someone who works at the same company you
 do, no less!) so I'm not quite sure why we're rehashing the issue.

 ...Robert




Re: [PERFORM] increase index performance

2009-05-13 Thread Matthew Wakeling

On Tue, 12 May 2009, Greg Smith wrote:
You should test what happens if you reduce the index to just being 
(city_id,street_id).


I think you're missing the point a little here. The point is that Thomas 
is creating an index on (city_id, street_id, house_id, floor_id) and 
running a query on (city_id, house_id, floor_id).


Thomas, the order of columns in the index matters. The index is basically 
a tree structure, which resolves the left-most column before resolving the 
column to the right of it. So to answer your query, it will resolve the 
city_id, then it will have to scan almost all of the tree under that, 
because you are not constraining for street_id. A much better index to 
answer your query is (city_id, house_id, floor_id) - then it can just look 
up straight away. Instead of the index returning some 200,000 rows to 
check, it will return just the 2000.
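
As a sketch (with a made-up table name), the reordered index described
above would be:

   CREATE INDEX addr_city_house_floor_idx
   ON addresses (city_id, house_id, floor_id);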


Matthew

--
An ant doesn't have a lot of processing power available to it. I'm not trying
to be speciesist - I wouldn't want to detract you from such a wonderful
creature, but, well, there isn't a lot there, is there?
   -- Computer Science Lecturer



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-13 Thread Stefan Kaltenbrunner

Greg Stark wrote:

On Mon, May 11, 2009 at 5:05 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:

Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-)

Well, not really - while it is fairly easy to get PostgreSQL running on a PS3,
it is not a fast platform. While the main CPU there is a pretty fast
Power-based core, it only has 256MB of RAM and a single SATA disk available
(though you could add some USB disks).


The nice thing about it is that TPC-C and other benchmarks all specify
their bottom-line number in some unit like transactions per second PER
DOLLAR. So using a PS3 should be able to get ridiculously good results
compared to expensive server hardware...


I kinda doubt that - the PS3 is certainly not server-grade hardware, so 
you can only compare it to a desktop, and I would bet that the typical 
desktop you get for the 400€ price of a PS3 (you can get 4GB of RAM and a 
quad-core CPU for that) is going to outperform it significantly for almost 
every workload...



Stefan




Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Robert Haas
On Tue, May 12, 2009 at 11:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Matthew Wakeling matt...@flymine.org writes:
 On Tue, 12 May 2009, Simon Riggs wrote:
 No, we spawn then authenticate.

 But you still have a single thread doing the accept() and spawn. At some
 point (maybe not now, but in the future) this could become a bottleneck
 given very short-lived connections.

 More to the point, each backend process is a pretty heavyweight object:
 it is a process, not a thread, and it's not going to be good for much
 until it's built up a reasonable amount of stuff in its private caches.
 I don't think the small number of cycles executed in the postmaster
 process amount to anything at all compared to the other overhead
 involved in getting a backend going.

AIUI, whenever the connection pooler switches to serving a new client,
it tells the PG backend to DISCARD ALL.  But why couldn't we just
implement this same logic internally?  IOW, when a client disconnects,
instead of having the backend exit immediately, have it perform the
equivalent of DISCARD ALL and then stick around for a minute or two
and, if a new connection request arrives within that time, have the
old backend handle the new connection...

(There is the problem of how to get the file descriptor returned by
the accept() call in the parent process down to the child... but I
think that at least on some UNIXen there is a way to pass an fd
through a socket, or even dup it into another process by opening it
from /proc/<pid>/fd)

...Robert



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-13 Thread Greg Stark
On Mon, May 11, 2009 at 5:05 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
 Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-)

 Well, not really - while it is fairly easy to get PostgreSQL running on a PS3,
 it is not a fast platform. While the main CPU there is a pretty fast
 Power-based core, it only has 256MB of RAM and a single SATA disk available
 (though you could add some USB disks).

The nice thing about it is that TPC-C and other benchmarks all specify
their bottom-line number in some unit like transactions per second PER
DOLLAR. So using a PS3 should be able to get ridiculously good results
compared to expensive server hardware...

-- 
greg



Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
The idea is good, but pooling *alone* will not be enough. I mean, if all
the pooler does is keep no more than N backends working, it won't be
enough. You never know what exactly your query will do: if you choose your
N value so as to be sure not to overload the CPU, and then some of your
queries start to read from disk, you waste your idle CPU time because it
was still possible to run other queries requiring CPU time rather than
I/O, etc...

I wrote some ideas about an ideal solution here (just ignore the word
MySQL - as it's a theory, it's valid for any db engine):
http://dimitrik.free.fr/db_STRESS_MySQL_540_and_others_Apr2009.html#note_5442

Rgds,
-Dimitri

On 5/13/09, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Glenn Maynard glennfmayn...@gmail.com wrote:
 I'm sorry, but I'm confused.  Everyone keeps talking about
 connection pooling, but Dimitri has said repeatedly that each client
 makes a single connection and then keeps it open until the end of
 the test, not that it makes a single connection per SQL query.
 Connection startup costs shouldn't be an issue.  Am I missing
 something here?

 Quite aside from the overhead of spawning new processes, if you have
 more active connections than you have resources for them to go after,
 you just increase context switching and resource contention, both of
 which have some cost, without any offsetting gains.  That would tend
 to explain why performance tapers off after a certain point.  A
 connection pool which queues requests prevents this degradation.

 It would be interesting, with each of the CPU counts, to profile
 PostgreSQL at the peak of each curve to see where the time goes when
 it is operating with an optimal poolsize.  Tapering after that point
 is rather uninteresting, and profiles would be less useful beyond that
 point, as the noise from the context switching and resource contention
 would make it harder to spot issues that really matter..

 -Kevin





Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Scott Carey

On 5/13/09 3:22 AM, Dimitri dimitrik...@gmail.com wrote:

 Hi Scott,
 
 On 5/12/09, Scott Carey sc...@richrelevance.com wrote:
 Although nobody wants to support it, he should try the patch that Jignesh K.
 Shah (from Sun) proposed that makes ProcArrayLock lighter-weight.  If it
 makes 32 cores much faster, then we have a smoking gun.
 
 Although everyone here is talking about this as an 'unoptimal' solution, the
 fact is there is no evidence that a connection pooler will fix the
 scalability from 16 to 32 cores.
 Certainly a connection pooler will help most results, but it may not fix the
 scalability problem.
 
 A question for Dimitri:
 What is the scalability from 16 to 32 cores at the 'peak' load that occurs
 near 2x the CPU count?  Is it also poor?  If this is also poor, IMO the
 community here should not be complaining about this unopimal case -- a
 connection pooler at that stage does little and prepared statements will
 increase throughput but not likely alter scalability.
 
 I'm attaching a small graph showing the TPS level on PG 8.4 depending on
 the number of cores (the X-axis is the number of concurrent users, the
 Y-axis is the TPS number). As you may see, the TPS increase is nearly
 linear while moving from 8 to 16 cores, while on 32 cores it grows slightly
 differently; what is unclear is why the TPS level stays limited to
 11,000 TPS on 32 cores. And it's a pure read-only workload.
 

Interesting.  What hardware is this, btw? Looks like the 32 core system
probably has 2x the CPU and a bit less interconnect efficiency versus the 16
core one (which would be typical).
Is the 16 core case the same, but with fewer cores per processor active?  Or
fewer processors total?
Understanding the scaling difference may require a better understanding of
the other differences besides core count.

 
 If that result scales, then the short term answer is a connection pooler.
 
 In the tests that Jignesh ran -- making the ProcArrayLock faster helped the
 case where connections = 2x the CPU core count quite a bit.
 
 The thread about the CPU scalability is "Proposal of tunable fix for
 scalability of 8.4", originally posted by Jignesh K. Shah
 j.k.s...@sun.com, March 11 2009.
 
 It would be very useful to see results of this benchmark with:
 1. A Connection Pooler
 
 will not help, as each client is *not* disconnecting/reconnecting
 during the test; as well, PG is holding up well even with 256 users. And
 the TPS limit is already reached at 64 users, so I don't think a pooler
 will help here.
 

Actually, it might help a little.  Postgres has a flaw that makes backends
block on a lock briefly based on the number of total backends -- active or
completely passive.  Your tool has some (very small) user-side delay and a
connection pooler would probably allow 64 of your users to efficiently 'fit'
in 48 or so connection pooler slots.

It is not about connecting and disconnecting in this case, it's about
minimizing Postgres' process count.  If this does help, it would hint at
certain bottlenecks.  If it doesn't, it would point elsewhere (and quiet some
critics).

However, it's unrealistic for any process-per-connection system to have fewer
backends than about 2x the core count -- else any waiting on I/O or network
will just starve the CPU.  So this would just be done for research, not as a
real answer to making it scale better.

For those who say "but, what if it's I/O bound!  You don't need more
backends then!": well, you don't need more CPU either if you're I/O bound.
By definition, CPU scaling tests imply the I/O can keep up.


 2. Jignesh's patch
 
 I've already tested it and it did not help in my case because the real
 problem is elsewhere.. (however, I did not test it yet with my latest
 config params)
 

Great to hear that! -- That means this case is probably not ProcArrayLock.
If it's Solaris, could we get:
1. The CPU stats when it is in the inefficient state near 64 or 128
concurrent users (vmstat, etc. -- I'm interested mostly in CPU
user/system/idle/wait time and context switches/sec).
2. A DTrace probe on the Postgres locks -- we might be able to identify
something here.

The results here would be useful -- if it's an expected condition in the
planner or parser, it would be useful confirmation.  If it's something
unexpected and easy to fix -- it might be changed relatively soon.

If it's not easy to detect, it could be many other things -- but the process
above at least rules some things out and better characterizes the state.

 3. Prepared statements
 
 
 yes, I'm preparing this test.
 
 #3 is important, because prepared statements are ideal for queries that
 perform well with low statistics_targets, and not ideal for those that
 require high statistics targets.  Realistically, an app won't have more than
 a couple dozen statement forms to prepare.  Setting the default statistics
 target to 5 is just a way to make some other query perform badly.
 
 Agreed, but as you may set a different statistics target *per* column, it
 should not be a problem. What is 

Re: [PERFORM] AMD Shanghai versus Intel Nehalem

2009-05-13 Thread Scott Carey


On 5/12/09 10:06 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 Just realized I made a mistake. I was under the impression that
 Shanghai CPUs had 8xxx numbers while Barcelonas had 23xx numbers.  I
 was wrong; it appears the 8xxx numbers are for 4+ socket servers while
 the 23xx numbers are for 2 or fewer sockets.  So there are several
 quite affordable Shanghai CPUs out there, and many of the ones I
 quoted as Barcelonas are in fact Shanghais with the larger 6M L3
 cache.
 

At this point, I wouldn't go below the 5520 on the Nehalem side (turbo + HT is
just too big a jump, as is the 1066MHz versus 800MHz memory jump).  It's $100
extra per CPU on a $10K+ machine.
The next 'step' is the 5550, since it can run 1333MHz memory and has 2x the
turbo -- but you would have to be more CPU-bound for that.  I wouldn't worry
about the 5530 or 5540; they will only scale a little up from the 5520.

For Opterons, I wouldn't touch anything but a Shanghai these days, since it's
just not much more and we know the cache differences are very important for
DB loads.




Re: [PERFORM] superlative missuse

2009-05-13 Thread Chris Browne
cl...@uah.es (Angel Alvarez) writes:
 more optimal plan... 
 morreoptimal configuration...

 we suffer a 'more optimal' superlative missuse

 there is  not so 'more optimal' thing but a simple 'better' thing.

 im not native english speaker but i think it still applies.

If I wanted to be pedantic about it, I'd say that the word "nearly" is
missing.

That is, it would be strictly correct if one instead said "more
nearly optimal".

I don't imagine people get too terribly confused by the lack of the
word "nearly", so I nearly don't care :-).
-- 
select 'cbbrowne' || '@' || 'acm.org';
http://linuxfinances.info/info/languages.html
Bureaucracies interpret communication as damage and route around it
-- Jamie Zawinski



Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Kevin Grittner
Dimitri dimitrik...@gmail.com wrote: 
 The idea is good, but pooling *alone* will not be enough. I mean, if
 all the pooler does is keep no more than N backends
 working, it won't be enough. You never know what exactly your
 query will do: if you choose your N value so as to be sure not to
 overload the CPU, and then some of your queries start to read from disk,
 you waste your idle CPU time because it was still possible to run
 other queries requiring CPU time rather than I/O, etc...
 
I never meant to imply that CPUs were the only resources which
mattered.  Network and disk I/O certainly come into play.  I would
think that various locks might count.  You have to benchmark your
actual workload to find the sweet spot for your load on your hardware.
 I've usually found it to be around (2 * cpu count) + (effective
spindle count), where the effective spindle count is determined not only
by your RAID setup but also by your access pattern.  (If everything is fully
cached, and you have no write delays because of a BBU RAID controller
with write-back, the effective spindle count is zero.)
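As a worked example with made-up numbers: a 16-core server whose access
pattern yields an effective spindle count of 8 would suggest a starting
pool size of about (2 * 16) + 8 = 40 connections.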
 
Since the curve generally falls off more slowly past the sweet spot
than it climbs to get there, I tend to go a little above the apparent
sweet spot to protect against bad performance in a different load mix
than my tests.
 
 I wrote some ideas about an ideal solution here (just ignore the
 word MySQL - as it's a theory, it's valid for any db engine):

http://dimitrik.free.fr/db_STRESS_MySQL_540_and_others_Apr2009.html#note_5442
 
I've seen similar techniques used in other databases, and I'm far from
convinced that it's ideal or optimal.
 
-Kevin



Re: [PERFORM] superlative missuse

2009-05-13 Thread Angel Alvarez
On Wednesday, 13 May 2009, Tom Lane wrote:
 Chris Browne cbbro...@acm.org writes:
  cl...@uah.es (Angel Alvarez) writes:
  there is  not so 'more optimal' thing but a simple 'better' thing.
 
  If I wanted to be pedantic about it, I'd say that the word nearly is
  missing.
 
  That is, it would be strictly correct if one instead said more
  nearly optimal.
 
 This sort of construction is longstanding practice in English anyway.
 The most famous example I can think of offhand is in the US
 Constitution: "... in order to form a more perfect union ..."

Whoa!!

So "Tom Lane for President" still applies!! :-)

Thanks, all of you. 

 
   regards, tom lane
 



-- 
Do not print this e-mail unless necessary. The environment is in our 
hands.
---

 Angel J. Alvarez Miguel, Systems Section 
 Operations Area, IT Services
 
 Edificio Torre de Control, Campus Externo UAH
 Alcalá de Henares 28806, Madrid  ** SPAIN **
 
 RedIRIS Jabber: angel.uah...@rediris.es
[www.uah.es]--- 
You buy it, I copy it. All legal.
-- 
Water for everything? No, water for everyone.

Clist UAH a.k.a Angel
-[www.uah.es]---

I wouldn't give Coca-Cola Zero even to my worst enemy. That's what mustard 
gas is for; it's more merciful.



Re: [PERFORM] AMD Shanghai versus Intel Nehalem

2009-05-13 Thread Scott Carey

On 5/12/09 11:08 PM, Arjen van der Meijden acmmail...@tweakers.net
wrote:

 We have a dual E5540 with 16GB (I think 1066MHz) memory here, but no AMD
 Shanghai. We haven't done PostgreSQL benchmarks yet, but given our
 previous experience, PostgreSQL should be equally faster compared to MySQL.
 
 Our database benchmark is actually mostly a CPU/memory benchmark.
 Comparing the results of the dual E5540 (2.53GHz with HT enabled) to a
 dual Intel X5355 (a 2.6GHz quad core from 2007), the peak load has
 increased from somewhere between 7 and 10 concurrent clients to
 somewhere around 25, suggesting more scalable hardware. With 25
 concurrent clients we handled 2.5 times as many queries/second as
 the 7-concurrent-client score for the X5355, both in MySQL
 5.0.41. At 7 concurrent clients we still had 1.7 times the previous result.
 

Excellent!  That is a pretty huge boost.   I'm curious which aspects of this
new architecture helped the most.  For Postgres, the following would seem
the most relevant:
1.  Shared L3 cache per processor -- more efficient shared data structure
access.
2.  Faster atomic operations -- CompareAndSwap, etc are much faster.
3.  Faster cache coherency.
4.  Lower latency RAM with more overall bandwidth (Opteron style).

Can you do a quick and dirty memory bandwidth test? (assuming Linux)
On the older X5355 machine and the newer E5540, try:
/sbin/hdparm -T /dev/sd<device>

Where <device> is a valid letter for a device on your system.

Here are the results for me on an older system with dual Intel E5335 (2GHz,
4MB cache, family 6 model 15)
Best result out of 5 (it's not all that consistent, plus or minus 10%)
/dev/sda:
 Timing cached reads:   10816 MB in  2.00 seconds = 5416.89 MB/sec

And a newer system with dual Xeon X5460 (3.16GHz, 6MB cache, family 6 model
23)
Best of 7 results:
/dev/sdb:
 Timing cached reads:   26252 MB in  1.99 seconds = 13174.42 MB/sec

It's not a very accurate measurement, but it's quick and highlights relative
hardware differences very easily.


 I'm not really sure how the Shanghai CPUs compare to those older
 X5355s; the AMDs should be faster, but by how much?
 

I'm not sure either, and the Xeon platforms have evolved such that the
chipsets and RAM configurations matter as much as the processor does.

 I've no idea if we'll get a Shanghai to compare it with, but we will get a
 dual X5570 soon on which we'll repeat some of the tests, so that should
 at least help a bit in scaling down the X5570 results seen around the world.
 
 Best regards,
 
 Arjen
 




Re: [PERFORM] increase index performance

2009-05-13 Thread Thomas Finneid

Matthew Wakeling wrote:
Thomas, the order of columns in the index matters. The index is 
basically a tree structure, which resolves the left-most column before 
resolving the column to the right of it. So to answer your query, it 
will resolve the city_id, then it will have to scan almost all of the 
tree under that, because you are not constraining for street_id. A much 
better index to answer your query is (city_id, house_id, floor_id) - 
then it can just look up straight away. Instead of the index returning 
some 200,000 rows to check, it will return just the 2000.


That's something I was a bit unsure about, because of the cardinality of 
the data. But thanks, I will try it. I just need to populate a new database 
with the new index. (Apparently, creating a new index on an already 
existing database is slower than just recreating the db, when the db is 
250 GB big.)



thomas



Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri Fontaine

Hi,

On 13 May 2009, at 18:42, Scott Carey wrote:

will not help, as each client is *not* disconnecting/reconnecting
during the test; as well, PG is holding up well even with 256 users. And the
TPS limit is already reached at 64 users, so I don't think a pooler will
help here.


Actually, it might help a little.  Postgres has a flaw that makes backends
block on a lock briefly based on the number of total backends -- active or
completely passive.  Your tool has some (very small) user-side delay and a
connection pooler would probably allow 64 of your users to efficiently 'fit'
in 48 or so connection pooler slots.


It seems you have think time, and I'm only insisting on what Scott
said, but having think time means a connection pool can help. PgBouncer
is a good choice because it won't even attempt to parse the queries,
and it has a flexible configuration.



3. Prepared statements

yes, I'm preparing this test.


It's possible to use prepared statements and benefit from PgBouncer at
the same time, but up until now it requires the application to test
whether its statements are already prepared at connection time,
because the application does not control when PgBouncer is reusing
an existing backend or giving it a fresh one.


As I think I need this solution too, I've coded a PG module to scratch  
that itch this morning, and just published it (BSD licenced) on  
pgfoundry:

  http://preprepare.projects.postgresql.org/README.html
  http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/preprepare/preprepare/

With this module and the proper PgBouncer setup (connect_query='SELECT
prepare_all();'), the application no longer has to special-case the
fresh-backend-nothing-prepared situation; it's all transparent, just
replace your SELECT query with its EXECUTE foo(x, y, z) counterpart.


I've taken the approach of putting the prepared statements themselves
into a table with columns "name" and "statement", the latter containing
the full PREPARE SQL command. There's a custom variable,
preprepare.relation, that has to be set to your table name (schema
qualified). Each statement that you then put in there will get prepared
when you SELECT prepare_all();
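
A minimal sketch of such a setup, following the description above (the
exact DDL and the sample entry are assumptions, not copied from the
module):

   CREATE TABLE public.prepared_statements (
       name      text PRIMARY KEY,
       statement text NOT NULL  -- the full PREPARE command
   );

   INSERT INTO public.prepared_statements
   VALUES ('streets_q',
           'PREPARE streets_q(int) AS SELECT * FROM streets WHERE city_id = $1');

   -- in postgresql.conf:  preprepare.relation = 'public.prepared_statements'
   -- then, from pgbouncer's connect_query or by hand:
   SELECT prepare_all();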


Hope this helps, regards,
--
dim