Re: [PERFORM] Configuration tips for very large database

2015-02-14 Thread Nico Sabbi

On 02/13/2015 12:19 AM, Claudio Freire wrote:

I have a table with ~800M rows, wide ones, that runs reporting queries
quite efficiently (usually seconds).

Of course, the queries don't traverse the whole table. That wouldn't
be efficient. That's probably the key there: don't make your database
process the whole thing every time if you expect it to be scalable.

What kind of queries are you running that have slowed down?

Post an explain analyze so people can diagnose. Possibly it's a
query/indexing issue rather than a hardware one.



Thanks everybody for the answers. At the moment I don't have the queries 
at hand (Saturday :-) ).

I'll post them next week.

I'd really like to avoid data partitioning if possible. It's a thing 
that gives me a strong stomach ache.




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Configuration tips for very large database

2015-02-13 Thread Graeme B. Bell

 
 
 Hi Nico,
 
 No one has mentioned the elephant in the room, but a database can
 be very I/O intensive and you may not be getting the performance
 you need from your virtual disk running on your VMware disk subsystem.
 What do IOmeter or other disk performance evaluation software report?
 
 Regards,
 Ken

Anecdatum: 

Moving from a contended VMware hard-disk-based filesystem running over the 
network to bare-metal RAID10 SSDs resulted in many DB operations running 
20-30x faster.

Table sizes circa 10-20G, millions of rows. 

Graeme.



Re: [PERFORM] Configuration tips for very large database

2015-02-13 Thread k...@rice.edu
On Thu, Feb 12, 2015 at 11:25:54PM +0100, Nico Sabbi wrote:
 Hello,
 I've been away from Postgres for several years, so please forgive
 me if I forgot nearly everything :-)
 
 I've just inherited a database collecting environmental data.
 There's a background process continually inserting records (not so
 often, to tell the truth) and a web interface to query data.
 At the moment the record count of the db is 250M and growing all the
 time. The 3 main tables have just 3 columns.
 
 Queries get executed very, very slowly, say 20 minutes. The most
 evident problem I see is that I/O wait load is almost always 90+%
 while querying data, 30-40% when idle (so to speak).
 Obviously disk access is to blame, but I'm a bit surprised because
 the cluster where this db is running is not at all old iron: it's a
 VMware VM with 16GB RAM, 4 CPUs at 2.2GHz, and a 128GB disk (half of
 which is used). The disk system underlying VMware is quite powerful;
 this Postgres instance is the only system that runs slowly in this cluster.
 I can increase resources if necessary, but..
 
 Even before analyzing queries (which I did), I'd like to know if
 someone has already succeeded in running postgres with 200-300M
 records with queries running much faster than this. I'd like to
 compare the current configuration with a super-optimized one to
 identify the parameters that need to be changed.
 Any link to a working configuration would be very appreciated.
 
 Thanks for any help,
   Nico
 

Hi Nico,

No one has mentioned the elephant in the room, but a database can
be very I/O intensive and you may not be getting the performance
you need from your virtual disk running on your VMware disk subsystem.
What do IOmeter or other disk performance evaluation software report?

Regards,
Ken
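
For a Linux guest, where IOmeter is less common, something like fio gives
comparable numbers. This is only a rough sketch (paths, sizes, and job counts
are assumptions to adjust for your environment), run against the same
filesystem the database lives on:

```shell
# Random 8kB reads, roughly matching PostgreSQL's page-sized I/O pattern.
fio --name=pgread --rw=randread --bs=8k --size=2G \
    --ioengine=libaio --direct=1 --runtime=60 --numjobs=4 \
    --group_reporting

# Meanwhile, watch per-device utilization and average wait times:
iostat -x 5
```

If the random-read IOPS are poor even with the database idle, the virtual
disk layer is the bottleneck, not the PostgreSQL configuration.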




Re: [PERFORM] Configuration tips for very large database

2015-02-12 Thread Kevin Grittner
Nico Sabbi nicola.sa...@poste.it wrote:

 Queries get executed very very slowly, say 20 minutes.

 I'd like to know if someone has already succeeded in running
 postgres with 200-300M records with queries running much faster
 than this.

If you go to the http://wcca.wicourts.gov/ web site, bring up any
case, and click the Court Record Events button, it will search a
table with hundreds of millions of rows.  The table is not
partitioned, but has several indexes on it which are useful for
queries such as the one that is used when you click the button.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
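
For illustration only (the real schema isn't shown here, so all names below
are hypothetical), the pattern is a multicolumn index whose columns match the
lookup, so a search touches a handful of index pages instead of the whole
table:

```sql
-- Hypothetical table and column names; the point is that the index
-- columns match the WHERE clause, so only a few rows are ever read.
CREATE INDEX record_events_case_idx
    ON record_events (county_no, case_no, event_date);

EXPLAIN ANALYZE
SELECT *
  FROM record_events
 WHERE county_no = 13
   AND case_no = '2015CV000123'
 ORDER BY event_date;
```

With a matching index, the plan shows an index scan reading a few rows
rather than a sequential scan over hundreds of millions.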




Re: [PERFORM] Configuration tips for very large database

2015-02-12 Thread Nico Sabbi

On 02/12/2015 11:38 PM, Kevin Grittner wrote:


If you go to the http://wcca.wicourts.gov/ web site, bring up any
case, and click the Court Record Events button, it will search a
table with hundreds of millions of rows.  The table is not
partitioned, but has several indexes on it which are useful for
queries such as the one that is used when you click the button.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Impressive. Can you give any hint on the configuration and on the 
underlying hardware?





[PERFORM] Configuration tips for very large database

2015-02-12 Thread Nico Sabbi

Hello,
I've been away from Postgres for several years, so please forgive me if 
I forgot nearly everything :-)


I've just inherited a database collecting environmental data. There's a 
background process continually inserting records (not so often, to tell 
the truth) and a web interface to query data.
At the moment the record count of the db is 250M and growing all the 
time. The 3 main tables have just 3 columns.


Queries get executed very, very slowly, say 20 minutes. The most evident 
problem I see is that I/O wait load is almost always 90+% while querying 
data, 30-40% when idle (so to speak).
Obviously disk access is to blame, but I'm a bit surprised because the 
cluster where this db is running is not at all old iron: it's a VMware 
VM with 16GB RAM, 4 CPUs at 2.2GHz, and a 128GB disk (half of which is 
used). The disk system underlying VMware is quite powerful; this Postgres 
instance is the only system that runs slowly in this cluster.

I can increase resources if necessary, but..

Even before analyzing queries (which I did), I'd like to know if someone 
has already succeeded in running postgres with 200-300M records with 
queries running much faster than this. I'd like to compare the current 
configuration with a super-optimized one to identify the parameters that 
need to be changed.

Any link to a working configuration would be very appreciated.

Thanks for any help,
  Nico




Re: [PERFORM] Configuration tips for very large database

2015-02-12 Thread Kevin Grittner
Nico Sabbi nicola.sa...@poste.it wrote:

 Can you give any hint on the configuration and on the underlying 
 hardware?

Well, this particular web site has millions of hits per day 
(running up to about 20 queries per hit) from thousands of 
concurrent web users, while accepting logical replication from 
thousands of OLTP users, so you probably 
don't need equivalent hardware.  If I recall correctly it is 
running 32 cores with 512GB RAM running two PostgreSQL clusters, 
each multiple TB, and each having a RAID 5 array of 40 drives, 
plus separate controllers and RAID for OS and WAL.

For server configuration, see these Wiki pages for the general
tuning techniques used:

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

https://wiki.postgresql.org/wiki/Number_Of_Database_Connections

The best course to solve your problem would probably be to review 
those and see what might apply, and if you still have a problem 
pick a specific slow-running query and use the process described 
here:

https://wiki.postgresql.org/wiki/SlowQueryQuestions

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
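
As a starting point only (rules of thumb from those wiki pages, not values
measured against this workload), a dedicated 16GB machine typically ends up
in this neighborhood; verify each setting against the guides above:

```
# postgresql.conf sketch for a dedicated 16GB machine -- tuning-guide
# starting points, not measured values
shared_buffers = 4GB          # ~25% of RAM
effective_cache_size = 12GB   # ~75% of RAM; planner hint only
work_mem = 64MB               # per sort/hash node, so keep modest
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
```

None of these help if the query plan itself is a full scan, which is why the
SlowQueryQuestions process comes next.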




Re: [PERFORM] Configuration tips for very large database

2015-02-12 Thread Mathis, Jason
I can't speak to the exact numbers PostgreSQL can or cannot handle, but the
numbers above sound very doable. If you can get hold of Greg Smith's
*PostgreSQL High Performance*, I always liked his method of tuning buffers
and checkpoints using the background writer stats. All of that can help
with the I/O load and caching.

good luck!
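
The background-writer stats referred to above live in a system view; a query
like this (column names as in the PostgreSQL releases current at the time)
shows whether checkpoints are mostly timed or forced:

```sql
-- If checkpoints_req dominates checkpoints_timed, checkpoints are being
-- forced by WAL volume and the checkpoint settings likely need raising.
SELECT checkpoints_timed,
       checkpoints_req,
       buffers_checkpoint,
       buffers_clean,
       buffers_backend
  FROM pg_stat_bgwriter;
```

A high buffers_backend relative to buffers_clean suggests backends are doing
their own writes, another sign the background writer needs tuning.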



On Thu, Feb 12, 2015 at 4:55 PM, Kevin Grittner kgri...@ymail.com wrote:

 Nico Sabbi nicola.sa...@poste.it wrote:

  Can you give any hint on the configuration and on the underlying
  hardware?

 Well, this particular web site has millions of hits per day
 (running up to about 20 queries per hit) from thousands of
 concurrent web users, while accepting logical replication from
 thousands of OLTP users, so you probably
 don't need equivalent hardware.  If I recall correctly it is
 running 32 cores with 512GB RAM running two PostgreSQL clusters,
 each multiple TB, and each having a RAID 5 array of 40 drives,
 plus separate controllers and RAID for OS and WAL.

 For server configuration, see these Wiki pages for the general
 tuning techniques used:

 https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

 https://wiki.postgresql.org/wiki/Number_Of_Database_Connections

 The best course to solve your problem would probably be to review
 those and see what might apply, and if you still have a problem
 pick a specific slow-running query and use the process described
 here:

 https://wiki.postgresql.org/wiki/SlowQueryQuestions

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [PERFORM] Configuration tips for very large database

2015-02-12 Thread Claudio Freire
On Thu, Feb 12, 2015 at 7:38 PM, Kevin Grittner kgri...@ymail.com wrote:
 Nico Sabbi nicola.sa...@poste.it wrote:

 Queries get executed very very slowly, say 20 minutes.

 I'd like to know if someone has already succeeded in running
 postgres with 200-300M records with queries running much faster
 than this.

 If you go to the http://wcca.wicourts.gov/ web site, bring up any
 case, and click the Court Record Events button, it will search a
 table with hundreds of millions of rows.  The table is not
 partitioned, but has several indexes on it which are useful for
 queries such as the one that is used when you click the button.

I have a table with ~800M rows, wide ones, that runs reporting queries
quite efficiently (usually seconds).

Of course, the queries don't traverse the whole table. That wouldn't
be efficient. That's probably the key there: don't make your database
process the whole thing every time if you expect it to be scalable.

What kind of queries are you running that have slowed down?

Post an explain analyze so people can diagnose. Possibly it's a
query/indexing issue rather than a hardware one.
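
For reference, the usual way to capture that from psql is shown below; the
BUFFERS option (available since 9.0) also reports how much I/O the query did,
which matters given the iowait figures reported earlier. The SELECT is a
placeholder for the actual slow query:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;   -- the slow reporting query goes here
```

Posting the full output, plus the table and index definitions, is what lets
the list diagnose whether this is a plan problem or a hardware one.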

