Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-19 Thread Greg Smith

On 5/13/13 6:36 PM, Mike McCann wrote:

stoqs_march2013_s=# explain analyze select * from
stoqs_measuredparameter order by datavalue;

QUERY PLAN

--
  Sort  (cost=422106.15..430560.68 rows=3381814 width=20) (actual
time=2503.078..2937.130 rows=3381814 loops=1)
Sort Key: datavalue
Sort Method: quicksort  Memory: 362509kB
->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14
rows=3381814 width=20) (actual time=0.016..335.745 rows=3381814 loops=1)
  Total runtime: 3094.601 ms
(5 rows)

I tried changing random_page_cost from 4 to 1 and saw no change.


Have you tried putting an index on datavalue on this table?  Once you've 
done that, changing random_page_cost will make using that index 
look less expensive.  Sorting chews through a good bit of CPU time, and 
that's where all of your runtime is being spent--once you increase 
work_mem very high, that is.
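
Something along these lines is a quick way to test it (untested sketch;
the index name is just an example):

CREATE INDEX stoqs_measuredparameter_datavalue_idx
    ON stoqs_measuredparameter (datavalue);
ANALYZE stoqs_measuredparameter;

-- with the data fully cached, random page reads cost about the same as
-- sequential ones, so tell the planner that
SET random_page_cost = 1.1;
EXPLAIN ANALYZE
SELECT * FROM stoqs_measuredparameter ORDER BY datavalue;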



I'm wondering now what changes might get this query to run in less than
one second.  If all the data is in memory, then will faster CPU and
memory be the things that help?


You're trying to fix a fundamental design issue with hardware.  That 
usually doesn't go well.  Once you get a box big enough to hold the 
whole database in RAM, the differences between server systems beyond 
that point are relatively small.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-19 Thread Scott Marlowe
On Sun, May 19, 2013 at 8:44 PM, Greg Smith g...@2ndquadrant.com wrote:
 On 5/13/13 6:36 PM, Mike McCann wrote:

 stoqs_march2013_s=# explain analyze select * from
 stoqs_measuredparameter order by datavalue;

 QUERY PLAN

 --
   Sort  (cost=422106.15..430560.68 rows=3381814 width=20) (actual
 time=2503.078..2937.130 rows=3381814 loops=1)
 Sort Key: datavalue
 Sort Method: quicksort  Memory: 362509kB
 ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14
 rows=3381814 width=20) (actual time=0.016..335.745 rows=3381814
 loops=1)
   Total runtime: 3094.601 ms
 (5 rows)

 I tried changing random_page_cost from 4 to 1 and saw no change.


 Have you tried putting an index on datavalue on this table?  Once you've
 done that, changing random_page_cost will make using that index look
 less expensive.  Sorting chews through a good bit of CPU time, and that's
 where all of your runtime is being spent--once you increase work_mem
 very high, that is.

This++ plus cluster on that index if you can.
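
Roughly (a one-time operation that takes an exclusive lock; the index
name is just an example):

CLUSTER stoqs_measuredparameter USING stoqs_measuredparameter_datavalue_idx;
ANALYZE stoqs_measuredparameter;

Since the data are loaded once and then only read, the physical ordering
won't degrade afterwards.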




Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-13 Thread Mike McCann
On May 7, 2013, at 4:21 PM, Jeff Janes wrote:

 On Thu, May 2, 2013 at 6:35 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Thu, May 2, 2013 at 5:11 PM, Mike McCann mcc...@mbari.org wrote:
  Hello,
 
  We are in the fortunate situation of having more money than time to help
  solve our PostgreSQL 9.1 performance problem.
 
  Our server hosts databases that are about 1 GB in size with the largest
  tables having order 10 million 20-byte indexed records. The data are loaded
  once and then read from a web app and other client programs.  Some of the
  queries execute ORDER BY on the results. There are typically less than a
  dozen read-only concurrent connections to any one database.
 
 I wouldn't count on this being a problem that can be fixed merely by throwing 
 money at it.
 
 How many rows does any one of these queries need to access and then ORDER BY?
 
 ...
 
 
  HP ProLiant DL360p Gen 8
  Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
  64GB RAM
  2x146GB 15K SAS hard drives
  3x200GB SATA SLC SSDs
  + the usual accessories (optical drive, rail kit, dual power supplies)
 
 If your DB is 1G, and will grow to 10G then the IO shouldn't be any
 problem, as the whole db should be cached in memory.
 
 
 But it can take a surprisingly long time to get it cached in the first place, 
 from a cold start.
 
 If that is the problem, pg_prewarm could help.  
 
 
 Cheers,
 
 Jeff

Thank you everyone for your suggestions.

It's clear that our current read performance is not limited by hardware.  An 
'explain analyze' for a sample query is:

stoqs_march2013_s=# show work_mem;
 work_mem 
--
 1MB
(1 row)

stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter order 
by datavalue;
QUERY PLAN  
  
--
 Sort  (cost=541002.15..549456.68 rows=3381814 width=20) (actual 
time=6254.780..7244.074 rows=3381814 loops=1)
   Sort Key: datavalue
   Sort Method: external merge  Disk: 112424kB
   ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14 rows=3381814 
width=20) (actual time=0.011..354.385 rows=3381814 loops=1)
 Total runtime: 7425.854 ms
(5 rows)


Increasing work_mem to 355 MB improves the performance by a factor of 2:

stoqs_march2013_s=# set work_mem='355MB';
SET
stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter order 
by datavalue;
QUERY PLAN  
  
--
 Sort  (cost=422106.15..430560.68 rows=3381814 width=20) (actual 
time=2503.078..2937.130 rows=3381814 loops=1)
   Sort Key: datavalue
   Sort Method: quicksort  Memory: 362509kB
   ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14 rows=3381814 
width=20) (actual time=0.016..335.745 rows=3381814 loops=1)
 Total runtime: 3094.601 ms
(5 rows)
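
(If this setting proves useful we will likely persist it rather than SET it
in each session -- something like the following, with the web app role name
just illustrative, keeping in mind that each sort in every connection can
use up to this much memory:)

ALTER DATABASE stoqs_march2013_s SET work_mem = '355MB';
-- or, scoped to the read-only web app role:
-- ALTER ROLE stoqs_webapp SET work_mem = '355MB';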

I tried changing random_page_cost from 4 to 1 and saw no change.

I'm wondering now what changes might get this query to run in less than one 
second.  If all the data is in memory, then will faster CPU and memory be the 
things that help?

We have an alternate (a bit more conventional) server configuration that we are 
considering:

HP ProLiant DL360p Gen 8
Dual Intel Xeon 3.3GHz 4-core E5-2643 CPUs
128GB PC3-12800 RAM
16x146GB 15K SAS hard drives
HP Smart Array P822/2GB FBWC controller + P420i w/ 2GB FBWC
+ the usual accessories (optical drive, rail kit, dual power supplies)


All suggestions welcomed!

-Mike

--
Mike McCann
Software Engineer
Monterey Bay Aquarium Research Institute
7700 Sandholdt Road
Moss Landing, CA 95039-9644
Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org



Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-13 Thread Jeff Janes
On Mon, May 13, 2013 at 3:36 PM, Mike McCann mcc...@mbari.org wrote:


 Increasing work_mem to 355 MB improves the performance by a factor of 2:

 stoqs_march2013_s=# set work_mem='355MB';
 SET
 stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter
 order by datavalue;
 QUERY PLAN


 --
  Sort  (cost=422106.15..430560.68 rows=3381814 width=20) (actual
 time=2503.078..2937.130 rows=3381814 loops=1)
Sort Key: datavalue
Sort Method: quicksort  Memory: 362509kB
    ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14
 rows=3381814 width=20) (actual time=0.016..335.745 rows=3381814 loops=1)
  Total runtime: 3094.601 ms
 (5 rows)


 I tried changing random_page_cost from 4 to 1 and saw no change.

 I'm wondering now what changes might get this query to run in less than
 one second.



I think you are worrying about the wrong thing here.  What is a web app
going to do with 3,381,814 rows, once it obtains them?  Your current
testing is not testing the time it takes to stream that data to the client,
or for the client to do something meaningful with that data.

If you only plan to actually fetch a few dozen of those rows, then you
probably need to incorporate that into your test, either by using a LIMIT,
or by using a mock-up of the actual application to do some timings.
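
For example, something closer to what a page view would actually run
(illustrative only):

EXPLAIN ANALYZE
SELECT * FROM stoqs_measuredparameter
ORDER BY datavalue
LIMIT 100;

With an index on datavalue that should become a bounded index scan rather
than a full sort, and come back in milliseconds.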

Also, what is the type and collation of the column you are sorting on?
Non-'C' collations of text columns sort about 3 times slower than the 'C'
collation does.
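
You can check with something like:

SELECT data_type, collation_name
  FROM information_schema.columns
 WHERE table_name = 'stoqs_measuredparameter'
   AND column_name = 'datavalue';

(collation_name will be NULL for non-collatable types such as double
precision.)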



 If all the data is in memory, then will faster CPU and memory be the
 things that help?


Yes, those would help (it is not clear to me which of the two would help
more), but I think you need to rethink your design of sending the entire
database table to the application server for each page-view.


Cheers,

Jeff


Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-13 Thread Mike McCann
On May 13, 2013, at 4:24 PM, Jeff Janes wrote:

 On Mon, May 13, 2013 at 3:36 PM, Mike McCann mcc...@mbari.org wrote:
 
 Increasing work_mem to 355 MB improves the performance by a factor of 2:
 
 stoqs_march2013_s=# set work_mem='355MB';
 SET
 stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter 
 order by datavalue;
 QUERY PLAN
 
 --
  Sort  (cost=422106.15..430560.68 rows=3381814 width=20) (actual 
 time=2503.078..2937.130 rows=3381814 loops=1)
Sort Key: datavalue
Sort Method: quicksort  Memory: 362509kB
 ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14 rows=3381814 
 width=20) (actual time=0.016..335.745 rows=3381814 loops=1)
  Total runtime: 3094.601 ms
 (5 rows)
 
 I tried changing random_page_cost from 4 to 1 and saw no change.
 
 I'm wondering now what changes might get this query to run in less than one 
 second.  
 
 
 I think you are worrying about the wrong thing here.  What is a web app going 
 to do with 3,381,814 rows, once it obtains them?  Your current testing is not 
 testing the time it takes to stream that data to the client, or for the 
 client to do something meaningful with that data.
 
 If you only plan to actually fetch a few dozen of those rows, then you 
 probably need to incorporate that into your test, either by using a LIMIT, or 
 by using a mock-up of the actual application to do some timings.
 
 Also, what is the type and collation of the column you are sorting on?  
 non-'C' collations of text columns sort about 3 times slower than 'C' 
 collation does.
 
  
 If all the data is in memory, then will faster CPU and memory be the things 
 that help?
 
 Yes, those would help (it is not clear to me which of the two would help 
 more), but I think you need to rethink your design of sending the entire 
 database table to the application server for each page-view.
 
 
 Cheers,
 
 Jeff

Hi Jeff,

The datavalue column is double precision:

stoqs_march2013_s=# \d+ stoqs_measuredparameter
                           Table "public.stoqs_measuredparameter"
     Column     |       Type       |                              Modifiers                               | Storage | Description
----------------+------------------+-----------------------------------------------------------------------+---------+-------------
 id             | integer          | not null default nextval('stoqs_measuredparameter_id_seq'::regclass) | plain   |
 measurement_id | integer          | not null                                                              | plain   |
 parameter_id   | integer          | not null                                                              | plain   |
 datavalue      | double precision | not null                                                              | plain   |
Indexes:
    "stoqs_measuredparameter_pkey" PRIMARY KEY, btree (id)
    "stoqs_measuredparameter_measurement_id_parameter_id_key" UNIQUE CONSTRAINT, btree (measurement_id, parameter_id)
    "stoqs_measuredparameter_datavalue" btree (datavalue)
    "stoqs_measuredparameter_measurement_id" btree (measurement_id)
    "stoqs_measuredparameter_parameter_id" btree (parameter_id)
Foreign-key constraints:
    "stoqs_measuredparameter_measurement_id_fkey" FOREIGN KEY (measurement_id) REFERENCES stoqs_measurement(id) DEFERRABLE INITIALLY DEFERRED
    "stoqs_measuredparameter_parameter_id_fkey" FOREIGN KEY (parameter_id) REFERENCES stoqs_parameter(id) DEFERRABLE INITIALLY DEFERRED
Has OIDs: no


Thanks for the suggestion and advice to examine the web app performance.  We've 
actually taken quite a few steps to optimize how the web app works. The example 
query I provided is a simple worst-case one that we can use to help us decide 
on the proper hardware.  An actual query performed by the web app is:

stoqs_march2013_s=# explain analyze SELECT stoqs_measuredparameter.id,
stoqs_march2013_s-#stoqs_parameter.name AS parameter__name,
stoqs_march2013_s-#stoqs_parameter.standard_name AS 
parameter__standard_name,
stoqs_march2013_s-#stoqs_measurement.depth AS measurement__depth,
stoqs_march2013_s-#stoqs_measurement.geom AS measurement__geom,
stoqs_march2013_s-#stoqs_instantpoint.timevalue AS 
measurement__instantpoint__timevalue,
stoqs_march2013_s-#stoqs_platform.name AS 
measurement__instantpoint__activity__platform__name,
stoqs_march2013_s-#stoqs_measuredparameter.datavalue AS datavalue,
stoqs_march2013_s-#stoqs_parameter.units AS parameter__units
stoqs_march2013_s-# FROM stoqs_parameter p1,
stoqs_march2013_s-#  stoqs_measuredparameter
stoqs_march2013_s-#  INNER JOIN stoqs_measurement ON 
(stoqs_measuredparameter.measurement_id = 

Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-13 Thread Scott Marlowe
On Mon, May 13, 2013 at 5:58 PM, Mike McCann mcc...@mbari.org wrote:

 We assume that steps taken to improve the worst-case query scenario will
 also improve these kinds of queries.  If anything above pops out as needing
 better planning, please let us know that too!

Bad assumption. If your real workload will be queries like the one
here that takes 700 ms, but you'll be running 10,000 of them a second,
your tuning / hardware choices are going to be much different than
if your query is going to be the previous 7-second one. Use realistic
queries, not ones that are nothing like what your real ones will be.
Then use pgbench and its ability to run custom SQL scripts to get a
REAL idea of how your hardware performs. Note that if you will run the
slow query you posted only about once a minute and roll it up or cache
it, then don't get too worried about it. Pay attention to the queries
that will add up, in aggregate, to your greatest load.
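
For example (file name and numbers made up), put one representative query
in a script and drive it with pgbench:

-- realistic_query.sql: a stand-in for one typical web app query
SELECT * FROM stoqs_measuredparameter ORDER BY datavalue LIMIT 100;

-- then, from the shell, something like:
--   pgbench -n -f realistic_query.sql -c 10 -T 60 stoqs_march2013_s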




Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-07 Thread Jeff Janes
On Thu, May 2, 2013 at 6:35 PM, Scott Marlowe scott.marl...@gmail.comwrote:

 On Thu, May 2, 2013 at 5:11 PM, Mike McCann mcc...@mbari.org wrote:
  Hello,
 
  We are in the fortunate situation of having more money than time to help
  solve our PostgreSQL 9.1 performance problem.
 
  Our server hosts databases that are about 1 GB in size with the largest
  tables having order 10 million 20-byte indexed records. The data are
 loaded
  once and then read from a web app and other client programs.  Some of the
  queries execute ORDER BY on the results. There are typically less than a
  dozen read-only concurrent connections to any one database.


I wouldn't count on this being a problem that can be fixed merely by
throwing money at it.

How many rows does any one of these queries need to access and then ORDER
BY?

...


  HP ProLiant DL360p Gen 8
  Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
  64GB RAM
  2x146GB 15K SAS hard drives
  3x200GB SATA SLC SSDs
  + the usual accessories (optical drive, rail kit, dual power supplies)

 If your DB is 1G, and will grow to 10G then the IO shouldn't be any
 problem, as the whole db should be cached in memory.



But it can take a surprisingly long time to get it cached in the first
place, from a cold start.

If that is the problem, pg_prewarm could help.
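
Usage is roughly (assuming the module is installed -- it is not part of
the 9.1 contrib set):

CREATE EXTENSION pg_prewarm;
-- pull the whole table into cache after a cold start;
-- repeat for any indexes you want warm as well
SELECT pg_prewarm('stoqs_measuredparameter');

Even a plain SELECT count(*) from the table will warm the OS cache after a
cold start, if installing a module is not an option.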


Cheers,

Jeff


Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-06 Thread Yuri Levinsky
Mike,

Given your budget, the following or something similar might be useful for
you:

HP 365GB Multi Level Cell G2 PCIe ioDrive2 for ProLiant Servers

 

This is a PCIe card-based, direct-attach solid state storage solution for
application performance enhancement. I believe you can
find cheaper solutions on the market that will provide the same performance
characteristics (935,000 write IOPS, up to 892,000 read IOPS, up to 3
GB/s bandwidth).

 

 

Sincerely yours,

 

 

Yuri Levinsky, DBA

Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel

Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222

 

From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Mike McCann
Sent: Friday, May 03, 2013 2:11 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Hardware suggestions for maximum read performance

 

Hello,

 

We are in the fortunate situation of having more money than time to help
solve our PostgreSQL 9.1 performance problem.

 

Our server hosts databases that are about 1 GB in size with the largest
tables having order 10 million 20-byte indexed records. The data are
loaded once and then read from a web app and other client programs.
Some of the queries execute ORDER BY on the results. There are typically
less than a dozen read-only concurrent connections to any one database.

 

SELECTs for data are taking 10s of seconds. We'd like to reduce this to
web app acceptable response times (less than 1 second). If this is
successful then the size of the database will grow by a factor of ten -
we will still want sub-second response times.  We are in the process of
going through the excellent suggestions in the PostgreSQL 9.0 High
Performance book to identify the bottleneck (we have reasonable
suspicions that we are I/O bound), but would also like to place an order
soon for the dedicated server which will host the production databases.
Here are the specs of a server that we are considering with a budget of
$13k US:

 

HP ProLiant DL360p Gen 8

Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs

64GB RAM

2x146GB 15K SAS hard drives

3x200GB SATA SLC SSDs

+ the usual accessories (optical drive, rail kit, dual power
supplies)

 

Opinions?

 

Thanks in advance for any suggestions you have.


-Mike

 

--
Mike McCann
Software Engineer
Monterey Bay Aquarium Research Institute
7700 Sandholdt Road
Moss Landing, CA 95039-9644
Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org 

 




Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-03 Thread Arjen van der Meijden

3x200GB suggests you want to use RAID5?

Perhaps you should just pick 2x200GB and set them to RAID1. With roughly 
200GB of storage, that should still easily house your potentially 
10GB database with ample room to allow the SSDs to balance the 
writes. But you save the investment and it's probably a bit faster with 
writes (although your RAID card may reduce or remove the differences 
with your workload).


You can then either keep the money or invest in faster CPUs. With few 
concurrent connections the E5-2643 (also a quad core, but with 3.3GHz 
cores rather than 2.4GHz) may be interesting.
It's obviously a bit of speculation whether that would help, but 
it should speed up sorts and other in-memory/CPU operations (even if 
you're not - and never will be - CPU-bound right now).


Best regards,

Arjen

On 3-5-2013 1:11 Mike McCann wrote:

Hello,

We are in the fortunate situation of having more money than time to help
solve our PostgreSQL 9.1 performance problem.

Our server hosts databases that are about 1 GB in size with the largest
tables having order 10 million 20-byte indexed records. The data are
loaded once and then read from a web app and other client programs.
  Some of the queries execute ORDER BY on the results. There are
typically less than a dozen read-only concurrent connections to any one
database.

SELECTs for data are taking 10s of seconds. We'd like to reduce this to
web app acceptable response times (less than 1 second). If this is
successful then the size of the database will grow by a factor of ten -
we will still want sub-second response times.  We are in the process of
going through the excellent suggestions in the PostgreSQL 9.0 High
Performance book to identify the bottleneck (we have reasonable
suspicions that we are I/O bound), but would also like to place an order
soon for the dedicated server which will host the production databases.
Here are the specs of a server that we are considering with a budget of
$13k US:

HP ProLiant DL360p Gen 8
Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
64GB RAM
2x146GB 15K SAS hard drives
3x200GB SATA SLC SSDs
+ the usual accessories (optical drive, rail kit, dual power supplies)

Opinions?

Thanks in advance for any suggestions you have.

-Mike

--
Mike McCann
Software Engineer
Monterey Bay Aquarium Research Institute
7700 Sandholdt Road
Moss Landing, CA 95039-9644
Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org






Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-03 Thread Scott Marlowe
Note that with Linux (and a few other OSes) you can use RAID-1E
http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID_1E
with an odd number of drives.

On Fri, May 3, 2013 at 12:16 AM, Arjen van der Meijden
acmmail...@tweakers.net wrote:
 3x200GB suggests you want to use RAID5?

 Perhaps you should just pick 2x200GB and set them to RAID1. With roughly
 200GB of storage, that should still easily house your potentially
 10GB-database with ample of room to allow the SSD's to balance the writes.
 But you save the investment and its probably a bit faster with writes
 (although your raid-card may reduce or remove the differences with your
 workload).

 You can then either keep the money or invest in faster cpu's. With few
 concurrent connections the E5-2643 (also a quad core, but with 3.3GHz cores
 rather than 2.4GHz) may be interesting.
 Its obviously a bit of speculation to see whether that would help, but it
 should speed up sorts and other in-memory/cpu-operations (even if you're not
 - and never will be - cpu-bound right now).

 Best regards,

 Arjen


 On 3-5-2013 1:11 Mike McCann wrote:

 Hello,

 We are in the fortunate situation of having more money than time to help
 solve our PostgreSQL 9.1 performance problem.

 Our server hosts databases that are about 1 GB in size with the largest
 tables having order 10 million 20-byte indexed records. The data are
 loaded once and then read from a web app and other client programs.
   Some of the queries execute ORDER BY on the results. There are
 typically less than a dozen read-only concurrent connections to any one
 database.

 SELECTs for data are taking 10s of seconds. We'd like to reduce this to
 web app acceptable response times (less than 1 second). If this is
 successful then the size of the database will grow by a factor of ten -
 we will still want sub-second response times.  We are in the process of
 going through the excellent suggestions in the PostgreSQL 9.0 High
 Performance book to identify the bottleneck (we have reasonable
 suspicions that we are I/O bound), but would also like to place an order
 soon for the dedicated server which will host the production databases.
 Here are the specs of a server that we are considering with a budget of
 $13k US:

 HP ProLiant DL360p Gen 8
 Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
 64GB RAM
 2x146GB 15K SAS hard drives
 3x200GB SATA SLC SSDs
 + the usual accessories (optical drive, rail kit, dual power supplies)

 Opinions?

 Thanks in advance for any suggestions you have.

 -Mike

 --
 Mike McCann
 Software Engineer
 Monterey Bay Aquarium Research Institute
 7700 Sandholdt Road
 Moss Landing, CA 95039-9644
 Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org






--
To understand recursion, one must first understand recursion.




Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-03 Thread Julien Cigar

On 05/03/2013 01:11, Mike McCann wrote:

Hello,



Hello,

We are in the fortunate situation of having more money than time to 
help solve our PostgreSQL 9.1 performance problem.


Our server hosts databases that are about 1 GB in size with the 
largest tables having order 10 million 20-byte indexed records. The 
data are loaded once and then read from a web app and other client 
programs.  Some of the queries execute ORDER BY on the results. There 
are typically less than a dozen read-only concurrent connections to 
any one database.




I would first check the problem queries .. 10 million rows isn't that 
huge. Perhaps you could paste your queries and an explain analyze of 
them? You could also log slow queries and use the auto_explain module
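
For a single session that's roughly (the threshold is just an example; the
same settings can also be made permanent via shared_preload_libraries in
postgresql.conf):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';  -- log the plan of anything slower
SET auto_explain.log_analyze = on;            -- include actual timings and row counts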


SELECTs for data are taking 10s of seconds. We'd like to reduce this 
to web app acceptable response times (less than 1 second). If this is 
successful then the size of the database will grow by a factor of ten 
- we will still want sub-second response times.  We are in the process 
of going through the excellent suggestions in the PostgreSQL 9.0 High 
Performance book to identify the bottleneck (we have reasonable 
suspicions that we are I/O bound), but would also like to place an 
order soon for the dedicated server which will host the production 
databases. Here are the specs of a server that we are considering with 
a budget of $13k US:


HP ProLiant DL360p Gen 8
Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
64GB RAM
2x146GB 15K SAS hard drives
3x200GB SATA SLC SSDs
+ the usual accessories (optical drive, rail kit, dual power supplies)

Opinions?

Thanks in advance for any suggestions you have.

-Mike

--
Mike McCann
Software Engineer
Monterey Bay Aquarium Research Institute
7700 Sandholdt Road
Moss Landing, CA 95039-9644
Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org




--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.



[PERFORM] Hardware suggestions for maximum read performance

2013-05-02 Thread Mike McCann
Hello,

We are in the fortunate situation of having more money than time to help solve 
our PostgreSQL 9.1 performance problem.

Our server hosts databases that are about 1 GB in size with the largest tables 
having order 10 million 20-byte indexed records. The data are loaded once and 
then read from a web app and other client programs.  Some of the queries 
execute ORDER BY on the results. There are typically less than a dozen 
read-only concurrent connections to any one database.

SELECTs for data are taking 10s of seconds. We'd like to reduce this to web app 
acceptable response times (less than 1 second). If this is successful then the 
size of the database will grow by a factor of ten - we will still want 
sub-second response times.  We are in the process of going through the 
excellent suggestions in the PostgreSQL 9.0 High Performance book to identify 
the bottleneck (we have reasonable suspicions that we are I/O bound), but would 
also like to place an order soon for the dedicated server which will host the 
production databases. Here are the specs of a server that we are considering 
with a budget of $13k US:

HP ProLiant DL360p Gen 8
Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
64GB RAM
2x146GB 15K SAS hard drives
3x200GB SATA SLC SSDs
+ the usual accessories (optical drive, rail kit, dual power supplies)

Opinions?

Thanks in advance for any suggestions you have.

-Mike

--
Mike McCann
Software Engineer
Monterey Bay Aquarium Research Institute
7700 Sandholdt Road
Moss Landing, CA 95039-9644
Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org



Re: [PERFORM] Hardware suggestions for maximum read performance

2013-05-02 Thread Scott Marlowe
On Thu, May 2, 2013 at 5:11 PM, Mike McCann mcc...@mbari.org wrote:
 Hello,

 We are in the fortunate situation of having more money than time to help
 solve our PostgreSQL 9.1 performance problem.

 Our server hosts databases that are about 1 GB in size with the largest
 tables having order 10 million 20-byte indexed records. The data are loaded
 once and then read from a web app and other client programs.  Some of the
 queries execute ORDER BY on the results. There are typically less than a
 dozen read-only concurrent connections to any one database.

 SELECTs for data are taking 10s of seconds. We'd like to reduce this to web
 app acceptable response times (less than 1 second). If this is successful
 then the size of the database will grow by a factor of ten - we will still
 want sub-second response times.  We are in the process of going through the
 excellent suggestions in the PostgreSQL 9.0 High Performance book to
 identify the bottleneck (we have reasonable suspicions that we are I/O
 bound), but would also like to place an order soon for the dedicated server
 which will host the production databases. Here are the specs of a server
 that we are considering with a budget of $13k US:

 HP ProLiant DL360p Gen 8
 Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
 64GB RAM
 2x146GB 15K SAS hard drives
 3x200GB SATA SLC SSDs
 + the usual accessories (optical drive, rail kit, dual power supplies)

If your DB is 1G and will grow to 10G, then the IO shouldn't be any
problem, as the whole db should be cached in memory. I'd look at
whether or not you've got good query plans, and tune them.
Things like setting random_page_cost to 1.something might be a good start,
and cranking up work_mem to ~16MB or so.
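
In SQL terms, something like (values illustrative; test per session before
making them permanent):

SET random_page_cost = 1.1;
SET work_mem = '16MB';
-- once happy, persist them for this database:
ALTER DATABASE stoqs_march2013_s SET random_page_cost = 1.1;
ALTER DATABASE stoqs_march2013_s SET work_mem = '16MB';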

