Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Andy Colson

On 8/16/2011 8:35 PM, Ogden wrote:

Hope all is well. I have received tremendous help from this list prior and 
therefore wanted some more advice.

I bought some new servers and instead of RAID 5 (which I think greatly hindered 
our writing performance), I configured 6 SCSI 15K drives with RAID 10. This is 
dedicated to /var/lib/pgsql. The main OS has 2 SCSI 15K drives on a different 
virtual disk and also Raid 10, a total of 146Gb. I was thinking of putting 
Postgres' xlog directory on the OS virtual drive. Does this even make sense to 
do?

The system memory is 64GB and the CPUs are dual Intel E5645 chips (they are 
6-core each).

It is a dedicated PostgreSQL box and needs to support heavy read and moderately 
heavy writes.

Currently, I have this for the existing system, which has 16GB of RAM:

  max_connections = 350

work_mem = 32MB
maintenance_work_mem = 512MB
wal_buffers = 640kB

# This is what I was helped with before and made reporting queries blaze by
seq_page_cost = 1.0
random_page_cost = 3.0
cpu_tuple_cost = 0.5
effective_cache_size = 8192MB

Any help and input is greatly appreciated.

Thank you

Ogden


What seems to be the problem?  I mean, if nothing is broken, then don't 
fix it :-)


You say reporting queries are fast, and the disks should take care of 
your slow write problem from before.  (Did you test the write 
performance?)  So, what's wrong?



-Andy



Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Ogden

On Aug 17, 2011, at 8:41 AM, Andy Colson wrote:

 On 8/16/2011 8:35 PM, Ogden wrote:
 Hope all is well. I have received tremendous help from this list prior and 
 therefore wanted some more advice.
 
 I bought some new servers and instead of RAID 5 (which I think greatly 
 hindered our writing performance), I configured 6 SCSI 15K drives with RAID 
 10. This is dedicated to /var/lib/pgsql. The main OS has 2 SCSI 15K drives 
 on a different virtual disk and also Raid 10, a total of 146Gb. I was 
 thinking of putting Postgres' xlog directory on the OS virtual drive. Does 
 this even make sense to do?
 
 The system memory is 64GB and the CPUs are dual Intel E5645 chips (they are 
 6-core each).
 
 It is a dedicated PostgreSQL box and needs to support heavy read and 
 moderately heavy writes.
 
 Currently, I have this for the current system which as 16Gb Ram:
 
  max_connections = 350
 
 work_mem = 32MB
 maintenance_work_mem = 512MB
 wal_buffers = 640kB
 
 # This is what I was helped with before and made reporting queries blaze by
 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.5
 effective_cache_size = 8192MB
 
 Any help and input is greatly appreciated.
 
 Thank you
 
 Ogden
 
 What seems to be the problem?  I mean, if nothing is broke, then don't fix it 
 :-)
 
 You say reporting query's are fast, and the disk's should take care of your 
 slow write problem from before.  (Did you test the write performance?)  So, 
 whats wrong?


I was wondering what the best parameters would be with my new setup. The 
work_mem will obviously increase, as will everything else, since it's a 64GB 
machine as opposed to a 16GB machine. The configuration I posted was for the 
16GB machine, but this new one has 64GB. I need help with how to scale these 
numbers up.

Thank you

Ogden


Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Tomas Vondra
On 17 August 2011, 3:35, Ogden wrote:
 Hope all is well. I have received tremendous help from this list prior and
 therefore wanted some more advice.

 I bought some new servers and instead of RAID 5 (which I think greatly
 hindered our writing performance), I configured 6 SCSI 15K drives with
 RAID 10. This is dedicated to /var/lib/pgsql. The main OS has 2 SCSI 15K
 drives on a different virtual disk and also Raid 10, a total of 146Gb. I
 was thinking of putting Postgres' xlog directory on the OS virtual drive.
 Does this even make sense to do?

Yes, but it greatly depends on the amount of WAL and your workload. If you
need to write a lot of WAL data (e.g. during bulk loading), this may
significantly improve performance. It may also help when you have a
write-heavy workload (a lot of clients updating records, background writer
etc.) as that usually means a lot of seeking (while WAL is written
sequentially).
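
If you do decide to try it, the usual approach is to relocate the directory
and leave a symlink behind - a rough sketch only (stop the server first; the
target path is just a placeholder for a mount point on the OS volume):

  # as the postgres user, with PostgreSQL shut down
  mv /var/lib/pgsql/data/pg_xlog /os_disk/pg_xlog
  ln -s /os_disk/pg_xlog /var/lib/pgsql/data/pg_xlog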

 The system memory is 64GB and the CPUs are dual Intel E5645 chips (they
 are 6-core each).

 It is a dedicated PostgreSQL box and needs to support heavy read and
 moderately heavy writes.

What is the size of the database? So those are the new servers? What's the
difference compared to the old ones? What is the RAID controller, how much
write cache is there?

 Currently, I have this for the current system which as 16Gb Ram:

  max_connections = 350

 work_mem = 32MB
 maintenance_work_mem = 512MB
 wal_buffers = 640kB

Are you really using 350 connections? Something like #cpus + #drives is
usually recommended as a sane number, unless the connections are idle most
of the time. And even in that case, connection pooling is usually recommended.

Anyway, if this worked fine for your workload, I don't think you need to
change those settings. I'd probably bump up wal_buffers to 16MB - it
might help a bit, definitely won't hurt, and it's such a small amount of
memory that it's not worth agonizing over.
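
In postgresql.conf that's a one-line change (note that wal_buffers requires
a server restart to take effect):

  wal_buffers = 16MB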


 # This is what I was helped with before and made reporting queries blaze
 by
 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.5
 effective_cache_size = 8192MB

Are you sure the cpu_tuple_cost = 0.5 is correct? That seems a bit crazy
to me, as it says reading a page sequentially is just twice as expensive
as processing a single tuple. This value should be about 100x lower, or
something like that.

What are the checkpoint settings (segments, completion target). What about
shared buffers?

Tomas




Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Tomas Vondra
On 17 August 2011, 16:28, Ogden wrote:
  I was wondering what the best parameters would be with my new setup. The
 work_mem obviously will increase as will everything else as it's a 64Gb
 machine as opposed to a 16Gb machine. The configuration I posted was for
 a 16Gb machine but this new one is 64Gb. I needed help in how to jump
 these numbers up.

Well, that really depends on how you arrived at the current work_mem setting.

If you've decided that with this amount of work_mem the queries run fine
and higher values don't give you better performance (because the amount of
data that needs to be sorted / hashed fits into work_mem), then don't
increase it.

But if you've just set it so that the memory is not exhausted, increasing
it may actually help you.

What I think you should review are the shared buffers, checkpoint and page
cache settings (see, for example,
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html).

Tomas




Re: [PERFORM] How to see memory usage using explain analyze ?

2011-08-17 Thread Igor Neyman

 -Original Message-
 From: hyelluas [mailto:helen_yell...@mcafee.com]
 Sent: Monday, August 15, 2011 2:33 PM
 To: pgsql-performance@postgresql.org
 Subject: Re: How to see memory usage using explain analyze ?
 
 Igor,
 
 Thank you. My tests showed better performance against the larger summary
 tables when I split the index into separate indexes for datasource_id and
 datex; I used to have a composite index.
 
 Regarding the index statistics - should I analyze the tables? I thought
 autovacuum takes care of it.
 
 helen
 


But with different sets of indexes, you can't compare execution plans.
Regarding statistics, you could try to ANALYZE the table manually, perhaps
after increasing default_statistics_target.
From the docs:

default_statistics_target (integer)

Sets the default statistics target for table columns that have not
had a column-specific target set via ALTER TABLE SET STATISTICS. Larger
values increase the time needed to do ANALYZE, but might improve the
quality of the planner's estimates. The default is 10. For more
information on the use of statistics by the PostgreSQL query planner,
refer to Section 14.2.
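
For example (a sketch only - the table name here is made up, the columns are
the ones you mentioned):

  ALTER TABLE summary_table ALTER COLUMN datasource_id SET STATISTICS 100;
  ALTER TABLE summary_table ALTER COLUMN datex SET STATISTICS 100;
  ANALYZE summary_table;

Or raise default_statistics_target for the whole cluster in postgresql.conf
and then re-run ANALYZE.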

HTH,
Igor



Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread bobbyw
Hi, I know this is an old thread, but I wanted to chime in since I am having
problems with this as well.

I too am trying to run dbt5 against Postgres.  Specifically I am trying to
run it against Postgres 9.1beta3.

After jumping through many hoops I ultimately was able to build dbt5 on my
debian environment, but when I attempt to run the benchmark with:

dbt5-run-workload -a pgsql -c 5000 -t 5000 -d 60 -u 1 -i ~/dbt5-0.1.0/egen 
-f 500 -w 300 -n dbt5 -p 5432 -o /tmp/results

it runs to completion but all of the dbt5 log files contain errors like:

terminate called after throwing an instance of 'pqxx::broken_connection'
  what():  could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket
/var/run/postgresql/.s.PGSQL.5432?

I'm led to believe that this is an error I would receive if the Postgres db
were not running, but it is.  In fact, the way dbt5-run-workload works, it
starts the database automatically.  I have also confirmed it is running by
manually connecting while this benchmark is in progress (and after it has
already started the database and logged the above error).

Any thoughts on why I might be getting this error?




Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Ogden

On Aug 17, 2011, at 9:44 AM, Tomas Vondra wrote:

 On 17 August 2011, 3:35, Ogden wrote:
 Hope all is well. I have received tremendous help from this list prior and
 therefore wanted some more advice.
 
 I bought some new servers and instead of RAID 5 (which I think greatly
 hindered our writing performance), I configured 6 SCSI 15K drives with
 RAID 10. This is dedicated to /var/lib/pgsql. The main OS has 2 SCSI 15K
 drives on a different virtual disk and also Raid 10, a total of 146Gb. I
 was thinking of putting Postgres' xlog directory on the OS virtual drive.
 Does this even make sense to do?
 
 Yes, but it greatly depends on the amount of WAL and your workload. If you
 need to write a lot of WAL data (e.g. during bulk loading), this may
 significantly improve performance. It may also help when you have a
 write-heavy workload (a lot of clients updating records, background writer
 etc.) as that usually means a lot of seeking (while WAL is written
 sequentially).

The database is about 200GB, so using /usr/local/pgsql/pg_xlog on a virtual disk 
with 100GB should not be a problem in terms of disk space, should it?

 The system memory is 64GB and the CPUs are dual Intel E5645 chips (they
 are 6-core each).
 
 It is a dedicated PostgreSQL box and needs to support heavy read and
 moderately heavy writes.
 
 What is the size of the database? So those are the new servers? What's the
 difference compared to the old ones? What is the RAID controller, how much
 write cache is there?
 

I am sorry I overlooked specifying this. The database is about 200GB, and yes, 
these are new servers which bring more power (RAM, CPU) than the last one. The 
RAID controller is a Perc H700 with 512MB of write cache. The servers are 
Dells. 

 Currently, I have this for the current system which as 16Gb Ram:
 
 max_connections = 350
 
 work_mem = 32MB
 maintenance_work_mem = 512MB
 wal_buffers = 640kB
 
 Are you really using 350 connections? Something like #cpus + #drives is
 usually recommended as a sane number, unless the connections are idle most
 of the time. And even in that case a pooling is recommended usually.
 
 Anyway if this worked fine for your workload, I don't think you need to
 change those settings. I'd probably bump up the wal_buffers to 16MB - it
 might help a bit, definitely won't hurt and it's so little memory it's not
 worth the effort I guess.

So just increasing wal_buffers is okay? I thought there would be more to 
change since the memory in the system is now 4 times as much. Perhaps 
shared_buffers too (see below). 

 
 # This is what I was helped with before and made reporting queries blaze
 by
 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.5
 effective_cache_size = 8192MB
 
 Are you sure the cpu_tuple_cost = 0.5 is correct? That seems a bit crazy
 to me, as it says reading a page sequentially is just twice as expensive
 as processing it. This value should be abou 100x lower or something like
 that.

Keep in mind that these settings are for the old server. It's a 16GB machine 
(the new one is 64GB). Should the value for cpu_tuple_cost be 0.005? How about 
the other ones?


 What are the checkpoint settings (segments, completion target). What about
 shared buffers?


#checkpoint_segments = 3            # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min          # range 30s-1h
checkpoint_completion_target = 0.9  # checkpoint target duration, 0.0 - 1.0 - was 0.5
#checkpoint_warning = 30s           # 0 disables

And

shared_buffers = 4096MB


Thank you very much

Ogden





Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread Andy Colson

On 8/17/2011 10:29 AM, bobbyw wrote:

Hi, I know this is an old thread, but I wanted to chime in since I am having
problems with this as well.

I too am trying to run dbt5 against Postgres.  Specifically I am trying to
run it against Postgres 9.1beta3.

After jumping through many hoops I ultimately was able to build dbt5 on my
debian environment, but when I attempt to run the benchmark with:

dbt5-run-workload -a pgsql -c 5000 -t 5000 -d 60 -u 1 -i ~/dbt5-0.1.0/egen
-f 500 -w 300 -n dbt5 -p 5432 -o /tmp/results

it runs to completion but all of the dbt5 log files contain errors like:

terminate called after throwing an instance of 'pqxx::broken_connection'
   what():  could not connect to server: No such file or directory
 Is the server running locally and accepting
 connections on Unix domain socket
/var/run/postgresql/.s.PGSQL.5432?

I'm lead to believe that this is an error I would receive if the Postgres db
were not running, but it is.  In fact, the way dbt5-run-workload works it
starts the database automatically.  I have also confirmed it is running by
manually connecting while this benchmark is in progress (and after it has
already started the database and logged the above error).

Any thoughts on why I might be getting this error?




It's trying to connect to the unix socket /var/run/postgresql/.s.PGSQL.5432,

but your postgresql.conf file probably has:
unix_socket_directory = '/tmp'


Change it to:
unix_socket_directory = '/var/run/postgresql'

and restart PG.


-Andy



Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread bobbyw
Awesome.. that did it!  It was actually not set at all in postgresql.conf,
although it was commented out as:

# unix_socket_directory = '' 

Presumably it was using the default of '/tmp'?

Anyway, after making that change dbt5 runs fine, but now when I try to
connect via psql I get:

psql.bin: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket /tmp/.s.PGSQL.5432?

Why is psql looking in /tmp?






Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread k...@rice.edu
On Wed, Aug 17, 2011 at 10:59:12AM -0700, bobbyw wrote:
 Awesome.. that did it!  It was actually not set at all in postgresql.conf,
 although it was commented out as:
 
 # unix_socket_directory = '' 
 
 Presumably it was using the default of '/tmp'?
 
 Anyway, after making that change dbt5 runs fine, but now when I try to
 connect via psql I get:
 
 psql.bin: could not connect to server: No such file or directory
 Is the server running locally and accepting
 connections on Unix domain socket /tmp/.s.PGSQL.5432?
 
 Why is psql looking in /tmp?
 

Because that is the compiled-in default location. If you want to change it,
you need to use the -h command-line option.
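
For example (assuming the dbt5 database name from earlier in the thread):

  psql -h /var/run/postgresql -p 5432 dbt5

When the -h value starts with a slash, psql treats it as the Unix-domain
socket directory rather than a host name.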

Regards,
Ken



[PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Anish Kejariwal
Hi everyone,

I'm using postgres 9.0.3, and here's the OS I'm running this on:
Linux 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 14:02:29 EDT 2011 x86_64
x86_64 x86_64 GNU/Linux

I have a fairly straightforward query.  I'm doing a group by on an ID, and
then calculating a statistic on the resulting data.  The problem I'm
running into is that when I'm calculating the statistics via a function,
it's twice as slow as when I'm calculating the statistics directly in my
query.  I want to be able to use a function, since I'll be using this
particular calculation in many places.

Any idea of what's going on?  Below, I've included my function and both
queries (I removed the type_ids and just wrote …ids…).

Here's my function (I also tried stable):
CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
integer)
RETURNS double precision AS $$
BEGIN
return a/b/c* 10::double precision;
END;
$$ LANGUAGE plpgsql immutable;


The query that takes 7.6 seconds, when I calculate the statistic from within
the query:
explain analyze
select
   agg.primary_id,
   avg(agg.a / agg.b / agg.c * 10::double precision) foo,
   stddev(agg.a / agg.b / agg.c * 10::double precision) bar
from mytable agg
where agg.type_id in (ids)
group by agg.primary_id;

The execution plan:
 HashAggregate  (cost=350380.58..350776.10 rows=9888 width=20) (actual
time=7300.414..7331.659 rows=20993 loops=1)
   ->  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
rows=1716127 width=20) (actual time=200.064..2861.600 rows=2309230 loops=1)
         Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
         ->  Bitmap Index Scan on mytable_type_id_idx  (cost=0.00..28238.87
rows=1716127 width=0) (actual time=192.725..192.725 rows=2309230 loops=1)
               Index Cond: (type_id = ANY ('{ids}'::integer[]))
 Total runtime: 7358.337 ms
(6 rows)




The same query, but now I'm calling the function.  When I call the function
it's taking 15.5 seconds.
explain analyze select
   agg.primary_id,
   avg(calc_test(agg.a,agg.b,agg.c)) foo,
   stddev(calc_test(agg.a,agg.b,agg.c)) bar
from mytable agg
where agg.type_id in (ids)
group by agg.primary_id;

and, here's the execution plan:

 HashAggregate  (cost=350380.58..355472.90 rows=9888 width=20) (actual
time=13660.838..13686.618 rows=20993 loops=1)
   ->  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
rows=1716127 width=20) (actual time=170.385..2881.122 rows=2309230 loops=1)
         Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
         ->  Bitmap Index Scan on mytable_type_id_idx  (cost=0.00..28238.87
rows=1716127 width=0) (actual time=162.834..162.834 rows=2309230 loops=1)
               Index Cond: (type_id = ANY ('{ids}'::integer[]))
 Total runtime: 13707.560 ms


Thanks!

Anish


[PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden
I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with 
the new one we have, which I have configured with RAID 10. The drives are the 
same (SAS 15K). I tried the new system with ext3 and then XFS but the results 
seem really outrageous as compared to the current system, or am I reading 
things wrong?

The benchmark results are here:

http://malekkoheavyindustry.com/benchmark.html


Thank you

Ogden

Re: [PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Pavel Stehule
Hello

2011/8/17 Anish Kejariwal anish...@gmail.com:
 Hi everyone,
 I'm using postgres 9.0.3, and here's the OS I'm running this on:
 Linux 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 14:02:29 EDT 2011 x86_64
 x86_64 x86_64 GNU/Linux
 I have a fairly straight forward query.  I'm doing a group by on an ID, and
 then calculating some a statistic on the resulting data.  The problem I'm
 running into is that when I'm calculating the statistics via a function,
 it's twice as slow as when I'm calculating the statistics directly in my
 query.  I want to be able to use a function, since I'll be using this
 particular calculation in many places.
 Any idea of what's going on?  Below, I've included my function, and both
 queries (I removed the type_ids, and just wrote …ids…
 Here's my function (I also tried stable):
 CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
 integer)
 RETURNS double precision AS $$
 BEGIN
         return a/b/c* 10::double precision;
 END;
 $$ LANGUAGE plpgsql immutable;


This is the overhead of the plpgsql call. For simple functions like this,
use a SQL function instead:

CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
 integer)
 RETURNS double precision AS $$
 SELECT $1/$2/$3* 10::double precision;
 $$ LANGUAGE sql;

Regards

Pavel Stehule

 The query that takes 7.6 seconds, when I calculate the statistic from within
 the query:
 explain analyze
 select
    agg.primary_id,
    avg(agg.a / agg.b / agg.c * 10::double precision) foo,
    stddev(agg.a / agg.b / agg.c * 10::double precision) bar
 from mytable agg
 where agg.type_id in (ids)
 group by agg.primary_id;
 The execution plan:
  HashAggregate  (cost=350380.58..350776.10 rows=9888 width=20) (actual
 time=7300.414..7331.659 rows=20993 loops=1)
    -  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
 rows=1716127 width=20) (actual time=200.064..2861.600 rows=2309230 loops=1)
          Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
          -  Bitmap Index Scan on mytable_type_id_idx  (cost=0.00..28238.87
 rows=1716127 width=0) (actual time=192.725..192.725 rows=2309230 loops=1)
                Index Cond: (type_id = ANY ('{ids}'::integer[]))
  Total runtime: 7358.337 ms
 (6 rows)



 The same query, but now I'm calling the function.  When I call the function
 it's taking 15.5 seconds.
 explain analyze select
    agg.primary_id,
    avg(calc_test(agg.a,agg.b,agg.c)) foo,
    stddev(calc_test(agg.a,agg.b,agg.c)) bar
 from mytable agg
 where agg.type_id in (ids)
 group by agg.primary_id;
 and, here's the execution plan:
  HashAggregate  (cost=350380.58..355472.90 rows=9888 width=20) (actual
 time=13660.838..13686.618 rows=20993 loops=1)
    -  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
 rows=1716127 width=20) (actual time=170.385..2881.122 rows=2309230 loops=1)
          Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
          -  Bitmap Index Scan on mytable_type_id_idx  (cost=0.00..28238.87
 rows=1716127 width=0) (actual time=162.834..162.834 rows=2309230 loops=1)
                Index Cond: (type_id = ANY ('{ids}'::integer[]))
  Total runtime: 13707.560 ms

 Thanks!
 Anish



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread k...@rice.edu
On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with 
 the new one we have, which I have configured with RAID 10. The drives are the 
 same (SAS 15K). I tried the new system with ext3 and then XFS but the results 
 seem really outrageous as compared to the current system, or am I reading 
 things wrong?
 
 The benchmark results are here:
 
 http://malekkoheavyindustry.com/benchmark.html
 
 
 Thank you
 
 Ogden

That looks pretty normal to me.

Ken



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden

On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:

 On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
 with the new one we have, which I have configured with RAID 10. The drives 
 are the same (SAS 15K). I tried the new system with ext3 and then XFS but 
 the results seem really outrageous as compared to the current system, or am 
 I reading things wrong?
 
 The benchmark results are here:
 
 http://malekkoheavyindustry.com/benchmark.html
 
 
 Thank you
 
 Ogden
 
 That looks pretty normal to me.
 
 Ken

But such a jump from the current db01 system to this? Over 20 times difference 
from the current system to the new one with XFS. Is that much of a jump normal?

Ogden


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread k...@rice.edu
On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:
 
 On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:
 
  On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
  I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
  with the new one we have, which I have configured with RAID 10. The drives 
  are the same (SAS 15K). I tried the new system with ext3 and then XFS but 
  the results seem really outrageous as compared to the current system, or 
  am I reading things wrong?
  
  The benchmark results are here:
  
  http://malekkoheavyindustry.com/benchmark.html
  
  
  Thank you
  
  Ogden
  
  That looks pretty normal to me.
  
  Ken
 
 But such a jump from the current db01 system to this? Over 20 times 
 difference from the current system to the new one with XFS. Is that much of a 
 jump normal?
 
 Ogden

Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get 
similar results with EXT4 as well, I suspect, although you did not test that.

Regards,
Ken



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Gary Doades

On 17/08/2011 7:26 PM, Ogden wrote:
I am using bonnie++ to benchmark our current Postgres system (on RAID 
5) with the new one we have, which I have configured with RAID 10. The 
drives are the same (SAS 15K). I tried the new system with ext3 and 
then XFS but the results seem really outrageous as compared to the 
current system, or am I reading things wrong?


The benchmark results are here:

http://malekkoheavyindustry.com/benchmark.html

The results are not completely outrageous, however you don't say what 
drives, how many, and what RAID controller you have in the current and 
new systems. You might expect that performance from 10/12 disks in RAID 
10 with a good controller. I would say that your current system is 
outrageous in that it is so slow!


Cheers,
Gary.



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Andy Colson

On 8/17/2011 1:35 PM, k...@rice.edu wrote:

On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:


On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:


On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:

I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with 
the new one we have, which I have configured with RAID 10. The drives are the 
same (SAS 15K). I tried the new system with ext3 and then XFS but the results 
seem really outrageous as compared to the current system, or am I reading 
things wrong?

The benchmark results are here:

http://malekkoheavyindustry.com/benchmark.html


Thank you

Ogden


That looks pretty normal to me.

Ken


But such a jump from the current db01 system to this? Over 20 times difference 
from the current system to the new one with XFS. Is that much of a jump normal?

Ogden


Yes, RAID5 is bad for in many ways. XFS is much better than EXT3. You would get 
similar
results with EXT4 as well, I suspect, although you did not test that.

Regards,
Ken



A while back I tested ext3 and xfs myself and found xfs performs better 
for PG.  However, I also have a photos site with 100K files (split into 
a small subset of directories), and xfs sucks bad on it.


So my db is on xfs, and my photos are on ext4.

The numbers between raid5 and raid10 don't really surprise me either.  I 
went from 100 Meg/sec to 230 Meg/sec going from 3 disk raid 5 to 4 disk 
raid 10.  (I'm, of course, using SATA drives with 4 gig of ram... 
and 2 cores.  Everyone with more than 8 cores and 64 gig of ram is off 
my Christmas list! :-) )


-Andy



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden

On Aug 17, 2011, at 1:48 PM, Andy Colson wrote:

 On 8/17/2011 1:35 PM, k...@rice.edu wrote:
 On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:
 
 On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:
 
 On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
 with the new one we have, which I have configured with RAID 10. The 
 drives are the same (SAS 15K). I tried the new system with ext3 and then 
 XFS but the results seem really outrageous as compared to the current 
 system, or am I reading things wrong?
 
 The benchmark results are here:
 
 http://malekkoheavyindustry.com/benchmark.html
 
 
 Thank you
 
 Ogden
 
 That looks pretty normal to me.
 
 Ken
 
 But such a jump from the current db01 system to this? Over 20 times 
 difference from the current system to the new one with XFS. Is that much of 
 a jump normal?
 
 Ogden
 
 Yes, RAID5 is bad for in many ways. XFS is much better than EXT3. You would 
 get similar
 results with EXT4 as well, I suspect, although you did not test that.
 
 Regards,
 Ken
 
 
 A while back I tested ext3 and xfs myself and found xfs performs better for 
 PG.  However, I also have a photos site with 100K files (split into a small 
 subset of directories), and xfs sucks bad on it.
 
 So my db is on xfs, and my photos are on ext4.


What about the OS itself? I put the Debian Linux system on XFS as well but haven't 
played around with it too much. Is it better to put the OS itself on ext4 and 
the /var/lib/pgsql partition on XFS?

Thanks

Ogden


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden

On Aug 17, 2011, at 1:33 PM, Gary Doades wrote:

 On 17/08/2011 7:26 PM, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
 with the new one we have, which I have configured with RAID 10. The drives 
 are the same (SAS 15K). I tried the new system with ext3 and then XFS but 
 the results seem really outrageous as compared to the current system, or am 
 I reading things wrong?
 
 The benchmark results are here:
 
 http://malekkoheavyindustry.com/benchmark.html
 
 The results are not completely outrageous, however you don't say what drives, 
 how many and what RAID controller you have in the current and new systems. 
 You might expect that performance from 10/12 disks in RAID 10 with a good 
 controller. I would say that your current system is outrageous in that is is 
 so slow!
 
 Cheers,
 Gary.


Yes, under heavy writes the load would shoot right up, which is what caused us 
to look at upgrading. If it is the RAID 5, it is mind-boggling that it could make 
that much of a difference. I expected a difference, but not that much. 

The new system has 6 drives, 300GB 15K SAS, which I've put into a RAID 10 
configuration. The current system is ext3 with RAID 5 over 4 disks on a Perc/5i 
controller, which has half the write cache of the new one (256MB vs 512MB). 

Ogden


Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Tomas Vondra
On 17 August 2011, 18:39, Ogden wrote:
 Yes, but it greatly depends on the amount of WAL and your workload. If
 you
 need to write a lot of WAL data (e.g. during bulk loading), this may
 significantly improve performance. It may also help when you have a
 write-heavy workload (a lot of clients updating records, background
 writer
 etc.) as that usually means a lot of seeking (while WAL is written
 sequentially).

 The database is about 200Gb so using /usr/local/pgsql/pg_xlog on a virtual
 disk with 100Gb should not be a problem with the disk space should it?

I think you've mentioned the database is on 6 drives, while the other
volume is on 2 drives, right? That makes the OS drive about 3x slower
(just a rough estimate). But if the database drive is used heavily, it
might help to move the xlog directory to the OS disk. See how the db
volume is utilized, and if it's fully utilized, try to move the xlog
directory.

The only way to find out is to actually try it with your workload.
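
For example, something like this (from the sysstat package) will show
per-device utilization while your workload runs - watch the %util column
for the database volume:

  iostat -dx 5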

 What is the size of the database? So those are the new servers? What's
 the difference compared to the old ones? What is the RAID controller, how
 much write cache is there?

 I am sorry I overlooked specifying this. The database is about 200Gb and
 yes these are new servers which bring more power (RAM, CPU) over the last
 one. The RAID Controller is a Perc H700 and there is 512Mb write cache.
 The servers are Dells.

OK, sounds good although I don't have much experience with this controller.

 Currently, I have this for the current system which as 16Gb Ram:

 max_connections = 350

 work_mem = 32MB
 maintenance_work_mem = 512MB
 wal_buffers = 640kB

 Anyway if this worked fine for your workload, I don't think you need to
 change those settings. I'd probably bump up the wal_buffers to 16MB - it
 might help a bit, definitely won't hurt and it's so little memory it's
 not
 worth the effort I guess.

 So just increasing the wal_buffers is okay? I thought there would be more
 as the memory in the system is now 4 times as much. Perhaps shared_buffers
 too (down below).

Yes, I was just commenting on that particular piece of config. Shared buffers
should be increased too.

 # This is what I was helped with before and made reporting queries
 blaze
 by
 seq_page_cost = 1.0
 random_page_cost = 3.0
 cpu_tuple_cost = 0.5
 effective_cache_size = 8192MB

 Are you sure the cpu_tuple_cost = 0.5 is correct? That seems a bit crazy
 to me, as it says reading a page sequentially is just twice as expensive
 as processing it. This value should be abou 100x lower or something like
 that.

 These settings are for the old server, keep in mind. It's a 16GB machine
 (the new one is 64Gb). The value for cpu_tuple_cost should be 0.005? How
 are the other ones?

The default values are like this:

seq_page_cost = 1.0
random_page_cost = 4.0
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.005
cpu_operator_cost = 0.0025

Increasing cpu_tuple_cost to 0.5 makes tuple processing way too expensive, I
guess, so the planner believes processing just two tuples is as expensive as
reading an 8kB page from disk. This change penalizes plans that process a lot
of tuples, e.g. sequential scans (and favors index scans etc.). Maybe it makes
sense in your case, I'm just wondering why you set it like that.
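
One low-risk way to check is to flip it back per session and compare plans
for one of your reporting queries - a sketch only:

  SHOW cpu_tuple_cost;          -- currently 0.5
  SET cpu_tuple_cost = 0.01;    -- the default, session-local
  -- run EXPLAIN ANALYZE on a representative reporting query here
  RESET cpu_tuple_cost;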

 What are the checkpoint settings (segments, completion target). What
 about
 shared buffers?


 #checkpoint_segments = 3# in logfile segments, min 1, 16MB
 each
 #checkpoint_timeout = 5min  # range 30s-1h
 checkpoint_completion_target = 0.9  # checkpoint target duration, 0.0
 - 1.0 - was 0.5
 #checkpoint_warning = 30s   # 0 disables

You need to bump checkpoint_segments up, e.g. to 64 or maybe even more. This
setting determines how many WAL segments can be filled before a checkpoint has
to happen. A checkpoint is the process in which dirty buffers from shared
buffers are written to disk, so it may be very I/O intensive. Each segment is
16MB, so 3 segments is just 48MB of data, while 64 is 1GB.

More checkpoint segments result in longer recovery in case of database
crash (because all the segments since last checkpoint need to be applied).
But it's essential for good write performance.

Completion target seems fine, but I'd consider increasing the timeout too.
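
Something along these lines, as a starting point only (the exact values need
to be validated against your workload; 15min for the timeout is just an
example):

  checkpoint_segments = 64              # ~1GB of WAL between checkpoints
  checkpoint_timeout = 15min            # up from the 5min default
  checkpoint_completion_target = 0.9    # you already have this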

 shared_buffers = 4096MB

The usual recommendation is about 25% of RAM for shared buffers, with 64GB
of RAM that is 16GB. And you should increase effective_cache_size too.
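
In postgresql.conf terms, a rough starting point for the 64GB box might be
(effective_cache_size here is an estimate of RAM left over for caching, not a
measured value):

  shared_buffers = 16GB           # ~25% of RAM
  effective_cache_size = 48GB     # rough guess: RAM minus shared_buffers and OS overhead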

See this: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Tomas




Re: [PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Anish Kejariwal
Thanks, Pavel! That definitely solved it.

Unfortunately, the function I gave you was a simple/short version of what
the actual function is going to be.  The actual function is going to get
parameters passed to it, and based on the parameters will go through some
if...else conditions, and maybe even call another function.  Based on that,
I was definitely hoping to use plpgsql, and the overhead is unfortunate.

Is there any way to get around this overhead?  Will I still have the same
overhead if I use plperl, plpython, pljava, or write the function in C?

Anish


On Wed, Aug 17, 2011 at 11:27 AM, Pavel Stehule pavel.steh...@gmail.comwrote:

 Hello

 2011/8/17 Anish Kejariwal anish...@gmail.com:
  Hi everyone,
  I'm using postgres 9.0.3, and here's the OS I'm running this on:
  Linux 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 14:02:29 EDT 2011 x86_64
  x86_64 x86_64 GNU/Linux
  I have a fairly straight forward query.  I'm doing a group by on an ID,
 and
  then calculating some a statistic on the resulting data.  The problem I'm
  running into is that when I'm calculating the statistics via a function,
  it's twice as slow as when I'm calculating the statistics directly in my
  query.  I want to be able to use a function, since I'll be using this
  particular calculation in many places.
  Any idea of what's going on?  Below, I've included my function, and both
  queries (I removed the type_ids, and just wrote …ids…
  Here's my function (I also tried stable):
  CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
  integer)
  RETURNS double precision AS $$
  BEGIN
  return a/b/c* 10::double precision;
  END;
  $$ LANGUAGE plpgsql immutable;
 

 this is overhead of plpgsql call. For this simple functions use a SQL
 functions instead

 CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
  integer)
  RETURNS double precision AS $$
  SELECT $1/$2/$3* 10::double precision;
  $$ LANGUAGE sql;

 Regards

 Pavel Stehule

  The query that takes 7.6 seconds, when I calculate the statistic from
 within
  the query:
  explain analyze
  select
 agg.primary_id,
 avg(agg.a / agg.b / agg.c * 10::double precision) foo,
 stddev(agg.a / agg.b / agg.c * 10::double precision) bar
  from mytable agg
  where agg.type_id in (ids)
  group by agg.primary_id;
  The execution plan:
   HashAggregate  (cost=350380.58..350776.10 rows=9888 width=20) (actual
  time=7300.414..7331.659 rows=20993 loops=1)
 -  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
  rows=1716127 width=20) (actual time=200.064..2861.600 rows=2309230
 loops=1)
   Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
   -  Bitmap Index Scan on mytable_type_id_idx
  (cost=0.00..28238.87
  rows=1716127 width=0) (actual time=192.725..192.725 rows=2309230 loops=1)
 Index Cond: (type_id = ANY ('{ids}'::integer[]))
   Total runtime: 7358.337 ms
  (6 rows)
 
 
 
  The same query, but now I'm calling the function.  When I call the
 function
  it's taking 15.5 seconds.
  explain analyze select
 agg.primary_id,
 avg(calc_test(agg.a,agg.b,agg.c)) foo,
 stddev(calc_test(agg.a,agg.b,agg.c)) bar
  from mytable agg
  where agg.type_id in (ids)
  group by agg.primary_id;
  and, here's the execution plan:
   HashAggregate  (cost=350380.58..355472.90 rows=9888 width=20) (actual
  time=13660.838..13686.618 rows=20993 loops=1)
 -  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
  rows=1716127 width=20) (actual time=170.385..2881.122 rows=2309230
 loops=1)
   Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
   -  Bitmap Index Scan on mytable_type_id_idx
  (cost=0.00..28238.87
  rows=1716127 width=0) (actual time=162.834..162.834 rows=2309230 loops=1)
 Index Cond: (type_id = ANY ('{ids}'::integer[]))
   Total runtime: 13707.560 ms
 
  Thanks!
  Anish



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Gary Doades

On 17/08/2011 7:56 PM, Ogden wrote:

On Aug 17, 2011, at 1:33 PM, Gary Doades wrote:


On 17/08/2011 7:26 PM, Ogden wrote:

I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with 
the new one we have, which I have configured with RAID 10. The drives are the 
same (SAS 15K). I tried the new system with ext3 and then XFS but the results 
seem really outrageous as compared to the current system, or am I reading 
things wrong?

The benchmark results are here:

http://malekkoheavyindustry.com/benchmark.html


The results are not completely outrageous, however you don't say what drives, 
how many and what RAID controller you have in the current and new systems. You 
might expect that performance from 10/12 disks in RAID 10 with a good 
controller. I would say that your current system is outrageous in that is is so 
slow!

Cheers,
Gary.


Yes, under heavy writes the load would shoot right up which is what caused us 
to look at upgrading. If it is the RAID 5, it is mind boggling that it could be 
that much of a difference. I expected a difference, now that much.

The new system has 6 drives, 300Gb 15K SAS and I've put them into a RAID 10 
configuration. The current system is ext3 with RAID 5 over 4 disks on a Perc/5i 
controller which has half the write cache as the new one (256 Mb vs 512Mb).
Hmm... for only 6 disks in RAID 10 I would say that the figures are a 
bit higher than I would expect. The PERC 5 controller is pretty poor in 
my opinion, the PERC 6 is a lot better, and the new H700s are pretty good. I'm 
guessing you have an H700 in your new system.


I've just got a Dell 515 with an H700 and 8 SAS drives in RAID 10, and I only get 
around 600 MB/s read using ext4 and Ubuntu 10.4 server.


Like I say, your figures are not outrageous, just unexpectedly good :)

Cheers,
Gary.



Re: [PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Pavel Stehule
2011/8/17 Anish Kejariwal anish...@gmail.com:
 Thanks Pavel! that definitely solved it.
 Unfortunately, the function I gave you was a simple/short version of what
 the actual function is going to be.  The actual function is going to get
 parameters passed to it, and based on the parameters will go through some
 if...else conditions, and maybe even call another function.  Based on that,
 I was definitely hoping to use plpgsql, and the overhead is unfortunate.
 Is there any way to get around this overhead?  Will I still have the same
 overhead if I use plperl, plpython, pljava, or write the function in C?

Only SQL and C have zero overhead - SQL because it uses inlining, and C
because it is essentially readable assembler.

I think the overhead of PL/pgSQL is the smallest of the languages on your list.
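
If the branching is simple enough, it can often be expressed in a SQL
function with CASE and still be inlined - a sketch only (the mode parameter
and the second branch here are invented, not your real logic):

  CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer,
                                       c integer, mode integer)
  RETURNS double precision AS $$
      SELECT CASE WHEN $4 = 1 THEN $1 / $2 / $3 * 10::double precision
                  ELSE $1 / $2 * 10::double precision
             END;
  $$ LANGUAGE sql IMMUTABLE;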

Regards

Pavel


 Anish

 On Wed, Aug 17, 2011 at 11:27 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 Hello

 2011/8/17 Anish Kejariwal anish...@gmail.com:
  Hi everyone,
  I'm using postgres 9.0.3, and here's the OS I'm running this on:
  Linux 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 14:02:29 EDT 2011 x86_64
  x86_64 x86_64 GNU/Linux
  I have a fairly straight forward query.  I'm doing a group by on an ID,
  and
  then calculating some a statistic on the resulting data.  The problem
  I'm
  running into is that when I'm calculating the statistics via a function,
  it's twice as slow as when I'm calculating the statistics directly in my
  query.  I want to be able to use a function, since I'll be using this
  particular calculation in many places.
  Any idea of what's going on?  Below, I've included my function, and both
  queries (I removed the type_ids, and just wrote …ids…
  Here's my function (I also tried stable):
  CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
  integer)
  RETURNS double precision AS $$
  BEGIN
          return a/b/c* 10::double precision;
  END;
  $$ LANGUAGE plpgsql immutable;
 

 this is overhead of plpgsql call. For this simple functions use a SQL
 functions instead

 CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c
  integer)
  RETURNS double precision AS $$
          SELECT $1/$2/$3* 10::double precision;
  $$ LANGUAGE sql;

 Regards

 Pavel Stehule

  The query that takes 7.6 seconds, when I calculate the statistic from
  within
  the query:
  explain analyze
  select
     agg.primary_id,
     avg(agg.a / agg.b / agg.c * 10::double precision) foo,
     stddev(agg.a / agg.b / agg.c * 10::double precision) bar
  from mytable agg
  where agg.type_id in (ids)
  group by agg.primary_id;
  The execution plan:
   HashAggregate  (cost=350380.58..350776.10 rows=9888 width=20) (actual
  time=7300.414..7331.659 rows=20993 loops=1)
     -  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
  rows=1716127 width=20) (actual time=200.064..2861.600 rows=2309230
  loops=1)
           Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
           -  Bitmap Index Scan on mytable_type_id_idx
   (cost=0.00..28238.87
  rows=1716127 width=0) (actual time=192.725..192.725 rows=2309230
  loops=1)
                 Index Cond: (type_id = ANY ('{ids}'::integer[]))
   Total runtime: 7358.337 ms
  (6 rows)
 
 
 
  The same query, but now I'm calling the function.  When I call the
  function
  it's taking 15.5 seconds.
  explain analyze select
     agg.primary_id,
     avg(calc_test(agg.a,agg.b,agg.c)) foo,
     stddev(calc_test(agg.a,agg.b,agg.c)) bar
  from mytable agg
  where agg.type_id in (ids)
  group by agg.primary_id;
  and, here's the execution plan:
   HashAggregate  (cost=350380.58..355472.90 rows=9888 width=20) (actual
  time=13660.838..13686.618 rows=20993 loops=1)
     -  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63
  rows=1716127 width=20) (actual time=170.385..2881.122 rows=2309230
  loops=1)
           Recheck Cond: (type_id = ANY ('{ids}'::integer[]))
           -  Bitmap Index Scan on mytable_type_id_idx
   (cost=0.00..28238.87
  rows=1716127 width=0) (actual time=162.834..162.834 rows=2309230
  loops=1)
                 Index Cond: (type_id = ANY ('{ids}'::integer[]))
   Total runtime: 13707.560 ms
 
  Thanks!
  Anish





Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Ogden

On Aug 17, 2011, at 1:56 PM, Tomas Vondra wrote:

 On 17 August 2011, 18:39, Ogden wrote:
 Yes, but it greatly depends on the amount of WAL and your workload. If
 you
 need to write a lot of WAL data (e.g. during bulk loading), this may
 significantly improve performance. It may also help when you have a
 write-heavy workload (a lot of clients updating records, background
 writer
 etc.) as that usually means a lot of seeking (while WAL is written
 sequentially).
 
 The database is about 200Gb so using /usr/local/pgsql/pg_xlog on a virtual
 disk with 100Gb should not be a problem with the disk space should it?
 
 I think you've mentioned the database is on 6 drives, while the other
 volume is on 2 drives, right? That makes the OS drive about 3x slower
 (just a rough estimate). But if the database drive is used heavily, it
 might help to move the xlog directory to the OS disk. See how is the db
 volume utilized and if it's fully utilized, try to move the xlog
 directory.
 
 The only way to find out is to actualy try it with your workload.

Thank you for your help. Just to confirm, then: for now I should put the xlog 
directory in /var/lib/pgsql, which is on the RAID container spanning the 6 
drives. You see, I wanted to put it on the container with the 2 drives because 
only the OS is installed on it and it has the space (about 100GB free). 

But you don't think it will be a problem to put the xlog directory along with 
everything else in /var/lib/pgsql/data? I had seen someone suggest separating 
it for their setup and it sounded like a good idea, so I thought why not; but 
in retrospect, given what you are saying about the OS drives being 3x slower, 
it may be fine just to keep it on the 6 drives. 

Thoughts?

Thank you once again for your tremendous help

Ogden


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Andy Colson

On 8/17/2011 1:55 PM, Ogden wrote:


On Aug 17, 2011, at 1:48 PM, Andy Colson wrote:


On 8/17/2011 1:35 PM, k...@rice.edu wrote:

On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:


On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:


On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:

I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with 
the new one we have, which I have configured with RAID 10. The drives are the 
same (SAS 15K). I tried the new system with ext3 and then XFS but the results 
seem really outrageous as compared to the current system, or am I reading 
things wrong?

The benchmark results are here:

http://malekkoheavyindustry.com/benchmark.html


Thank you

Ogden


That looks pretty normal to me.

Ken


But such a jump from the current db01 system to this? Over 20 times difference 
from the current system to the new one with XFS. Is that much of a jump normal?

Ogden


Yes, RAID5 is bad for in many ways. XFS is much better than EXT3. You would get 
similar
results with EXT4 as well, I suspect, although you did not test that.

Regards,
Ken



A while back I tested ext3 and xfs myself and found xfs performs better for PG. 
 However, I also have a photos site with 100K files (split into a small subset 
of directories), and xfs sucks bad on it.

So my db is on xfs, and my photos are on ext4.



What about the OS itself? I put the Debian linux sysem also on XFS but haven't 
played around with it too much. Is it better to put the OS itself on ext4 and 
the /var/lib/pgsql partition on XFS?

Thanks

Ogden


I doubt it matters.  The OS is not going to batch-delete thousands of 
files.  Once it's set up, it's pretty constant.  I would not worry about it.


-Andy



Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Scott Marlowe
On Wed, Aug 17, 2011 at 12:56 PM, Tomas Vondra t...@fuzzy.cz wrote:

 I think you've mentioned the database is on 6 drives, while the other
 volume is on 2 drives, right? That makes the OS drive about 3x slower
 (just a rough estimate). But if the database drive is used heavily, it
 might help to move the xlog directory to the OS disk. See how is the db
 volume utilized and if it's fully utilized, try to move the xlog
 directory.

 The only way to find out is to actualy try it with your workload.

This is a very important point.  I've found on most machines with
hardware caching RAID and 8 or fewer 15k SCSI drives it's just as
fast to put it all on one big RAID-10 and, if necessary, partition it to
put the pg_xlog on its own file system.  After that, depending on the
workload, you might need a LOT of drives in the pg_xlog dir or just a
pair.  Under normal ops many dbs will use only a tiny % of a
dedicated pg_xlog.  Then something like a site indexer starts to run,
writing heavily to the db, and the usage shoots to 100% and it's
the bottleneck.



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread J Sisson
On Wed, Aug 17, 2011 at 1:55 PM, Ogden li...@darkstatic.com wrote:



 What about the OS itself? I put the Debian linux sysem also on XFS but
 haven't played around with it too much. Is it better to put the OS itself on
 ext4 and the /var/lib/pgsql partition on XFS?


We've always put the OS on whatever default filesystem it uses, and then put
PGDATA on a RAID 10/XFS and PGXLOG on RAID 1/XFS (and for our larger
installations, we setup another RAID 10/XFS for heavily accessed indexes or
tables).  If you have a battery-backed cache on your controller (and it's
been tested to work), you can increase performance by mounting the XFS
partitions with nobarrier...just make sure your battery backup works.
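
For example, an fstab entry along these lines (the device and mount point are
placeholders; only use nobarrier if the write cache really is battery-protected):

  /dev/sdb1  /var/lib/pgsql  xfs  defaults,nobarrier  0  0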

I don't know how current this information is for 9.x (we're still on 8.4),
but there is (used to be?) a threshold above which more shared_buffers
didn't help.  The numbers vary, but somewhere between 8 and 16 GB is
typically quoted.  We set ours to 25% RAM, but no more than 12 GB (even for
our machines with 128+ GB of RAM) because that seems to be a breaking point
for our workload.

Of course, no advice will take the place of testing with your workload, so
be sure to test =)


Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Ogden

On Aug 17, 2011, at 2:14 PM, Scott Marlowe wrote:

 On Wed, Aug 17, 2011 at 12:56 PM, Tomas Vondra t...@fuzzy.cz wrote:
 
 I think you've mentioned the database is on 6 drives, while the other
 volume is on 2 drives, right? That makes the OS drive about 3x slower
 (just a rough estimate). But if the database drive is used heavily, it
 might help to move the xlog directory to the OS disk. See how is the db
 volume utilized and if it's fully utilized, try to move the xlog
 directory.
 
 The only way to find out is to actualy try it with your workload.
 
 This is a very important point.  I've found on most machines with
 hardware caching RAID and  8 or fewer 15k SCSI drives it's just as
 fast to put it all on one big RAID-10 and if necessary partition it to
 put the pg_xlog on its own file system.  After that depending on the
 workload you might need a LOT of drives in the pg_xlog dir or just a
 pair.  Under normal ops many dbs will use only a tiny % of a
 dedicated pg_xlog.  Then something like a site indexer starts to run,
 and writing heavily to the db, and the usage shoots to 100% and it's
 the bottleneck.

I suppose this is my confusion. Or rather, I am curious about this. On my 
current production database the pg_xlog directory is 8Gb (our total database is 
200Gb). Does this warrant a totally separate setup (and hardware) from PGDATA?
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Tuning Tips for a new Server

2011-08-17 Thread Tomas Vondra
On 17 Srpen 2011, 21:22, Ogden wrote:
 This is a very important point.  I've found on most machines with
 hardware caching RAID and  8 or fewer 15k SCSI drives it's just as
 fast to put it all on one big RAID-10 and if necessary partition it to
 put the pg_xlog on its own file system.  After that depending on the
 workload you might need a LOT of drives in the pg_xlog dir or just a
 pair.  Under normal ops many dbs will use only a tiny % of a
 dedicated pg_xlog.  Then something like a site indexer starts to run,
 and writing heavily to the db, and the usage shoots to 100% and it's
 the bottleneck.

 I suppose this is my confusion. Or rather I am curious about this. On my
 current production database the pg_xlog directory is 8Gb (our total
 database is 200Gb). Does this warrant a totally separate setup (and
 hardware) than PGDATA?

This is not about database size, it's about the workload - the way you're
using your database. Even a small database may produce a lot of WAL
segments, if the workload is write-heavy. So it's impossible to recommend
something except to try that on your own.
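
One cheap way to see how write-heavy the workload really is: sample the WAL
insert position twice under normal load and see how far it moved (a rough
sketch; pg_current_xlog_location() exists on the 8.4/9.0 releases discussed
here, and the 300-second window is arbitrary):

    psql -At -c "SELECT pg_current_xlog_location();"   # note the value
    sleep 300
    psql -At -c "SELECT pg_current_xlog_location();"   # note it again
    # the distance between the two positions over 300 s is your WAL write rate;
    # compare that with what the 2-drive OS mirror can sustain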

Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread Tom Lane
k...@rice.edu k...@rice.edu writes:
 On Wed, Aug 17, 2011 at 10:59:12AM -0700, bobbyw wrote:
 Why is psql looking in /tmp?

 Because that is the default location. If you want to change it, you need
 to use the -h commandline option.

It sounds to me like bobbyw might have two separate installations of
postgres (or at least two copies of psql), one compiled with /tmp as the
default socket location and one compiled with /var/run/postgresql as the
default.  /tmp is the out-of-the-box default but I think Debian likes to
build it with /var/run/postgresql as the default.
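
Either way, pointing the client explicitly at the intended socket directory (or
TCP host) sidesteps the mismatch; for instance (port and database name are only
examples):

    psql -h /tmp -p 5432 dbt5                  # source-default socket directory
    psql -h /var/run/postgresql -p 5432 dbt5   # Debian-packaged default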

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden

On Aug 17, 2011, at 1:35 PM, k...@rice.edu wrote:

 On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:
 
 On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:
 
 On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
 with the new one we have, which I have configured with RAID 10. The drives 
 are the same (SAS 15K). I tried the new system with ext3 and then XFS but 
 the results seem really outrageous as compared to the current system, or 
 am I reading things wrong?
 
 The benchmark results are here:
 
 http://malekkoheavyindustry.com/benchmark.html
 
 
 Thank you
 
 Ogden
 
 That looks pretty normal to me.
 
 Ken
 
 But such a jump from the current db01 system to this? Over 20 times 
 difference from the current system to the new one with XFS. Is that much of 
 a jump normal?
 
 Ogden
 
 Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get
 similar results with EXT4 as well, I suspect, although you did not test that.


I tested ext4 and the results did not seem to be that close to XFS, especially
when looking at the Block K/sec for the Sequential Output.

http://malekkoheavyindustry.com/benchmark.html

So XFS would be best in this case?

Thank you

Ogden

Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread Aidan Van Dyk
On Wed, Aug 17, 2011 at 4:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 It sounds to me like bobbyw might have two separate installations of
 postgres (or at least two copies of psql), one compiled with /tmp as the
 default socket location and one compiled with /var/run/postgresql as the
 default.  /tmp is the out-of-the-box default but I think Debian likes to
 build it with /var/run/postgresql as the default.

It looked like the actual DBT-5 harness is built with system
libraries (libpqxx, linked to system libpq, with debian's
/var/run/postgresql), but the scaffolding around it uses a local
postgres (server and psql) using the source default of /tmp?

a.


-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] index not being used when variable is sent

2011-08-17 Thread Jim Nasby
On Aug 17, 2011, at 1:49 AM, Eyal Wilde wrote:
 1. is there any more elegant solution?

Very possibly, but I'm having a heck of a time trying to figure out what your 
current code is actually doing.

What's the actual problem you're trying to solve here?
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread k...@rice.edu
On Wed, Aug 17, 2011 at 03:40:03PM -0500, Ogden wrote:
 
 On Aug 17, 2011, at 1:35 PM, k...@rice.edu wrote:
 
  On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:
  
  On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:
  
  On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
  I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
  with the new one we have, which I have configured with RAID 10. The 
  drives are the same (SAS 15K). I tried the new system with ext3 and then 
  XFS but the results seem really outrageous as compared to the current 
  system, or am I reading things wrong?
  
  The benchmark results are here:
  
  http://malekkoheavyindustry.com/benchmark.html
  
  
  Thank you
  
  Ogden
  
  That looks pretty normal to me.
  
  Ken
  
  But such a jump from the current db01 system to this? Over 20 times 
  difference from the current system to the new one with XFS. Is that much 
  of a jump normal?
  
  Ogden
  
  Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get
  similar results with EXT4 as well, I suspect, although you did not test that.
 
 
 I tested ext4 and the results did not seem to be that close to XFS,
 especially when looking at the Block K/sec for the Sequential Output.
 
 http://malekkoheavyindustry.com/benchmark.html
 
 So XFS would be best in this case?
 
 Thank you
 
 Ogden

It appears so, for at least the Bonnie++ benchmark. I would really try to
benchmark your actual DB on both EXT4 and XFS, because some of the comparative
benchmarks between the two give the win to EXT4 for INSERT/UPDATE database
usage with PostgreSQL. Only your application will know for sure. :)
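
If you don't have a replayable copy of your own workload handy, pgbench (in
contrib) at least produces a database-shaped, write-heavy load you can run
identically on both filesystems; the scale and client count below are only
illustrative:

    pgbench -i -s 100 testdb       # initialize a scale-100 test database
    pgbench -c 16 -T 600 testdb    # 16 clients, 10 minutes of TPC-B-style writes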

Ken

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden

On Aug 17, 2011, at 3:56 PM, k...@rice.edu wrote:

 On Wed, Aug 17, 2011 at 03:40:03PM -0500, Ogden wrote:
 
 On Aug 17, 2011, at 1:35 PM, k...@rice.edu wrote:
 
 On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:
 
 On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote:
 
 On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
 with the new one we have, which I have configured with RAID 10. The 
 drives are the same (SAS 15K). I tried the new system with ext3 and then 
 XFS but the results seem really outrageous as compared to the current 
 system, or am I reading things wrong?
 
 The benchmark results are here:
 
 http://malekkoheavyindustry.com/benchmark.html
 
 
 Thank you
 
 Ogden
 
 That looks pretty normal to me.
 
 Ken
 
 But such a jump from the current db01 system to this? Over 20 times 
 difference from the current system to the new one with XFS. Is that much 
 of a jump normal?
 
 Ogden
 
 Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get
 similar results with EXT4 as well, I suspect, although you did not test that.
 
 
 I tested ext4 and the results did not seem to be that close to XFS,
 especially when looking at the Block K/sec for the Sequential Output.
 
 http://malekkoheavyindustry.com/benchmark.html
 
 So XFS would be best in this case?
 
 Thank you
 
 Ogden
 
 It appears so for at least the Bonnie++ benchmark. I would really try to 
 benchmark
 your actual DB on both EXT4 and XFS because some of the comparative 
 benchmarks between
 the two give the win to EXT4 for INSERT/UPDATE database usage with 
 PostgreSQL. Only
 your application will know for sure:)
 
 Ken


What are some good methods that one can use to benchmark PostgreSQL under heavy
loads? I.e., to emulate heavy writes? Are there any existing scripts and whatnot?

Thank you

Afra
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] DBT-5 Postgres 9.0.3

2011-08-17 Thread Tom Lane
Aidan Van Dyk ai...@highrise.ca writes:
 On Wed, Aug 17, 2011 at 4:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It sounds to me like bobbyw might have two separate installations of
 postgres (or at least two copies of psql), one compiled with /tmp as the
 default socket location and one compiled with /var/run/postgresql as the
 default.  /tmp is the out-of-the-box default but I think Debian likes to
 build it with /var/run/postgresql as the default.

 It looked like the actual DBT-5 harness is built with system
 libraries (libpqxx, linked to system libpq, with debian's
 /var/run/postgresql), but the scaffolding around it uses a local
 postgres (server and psql) using the source default of /tmp?

Hmm ... doesn't sound like an amazingly good idea.  But if DBT wants to
do it that way, it'd be well advised to not assume that the system
libraries have either the port number or socket directory defaulting
to what it is using.  Or maybe the problem is that it does override all
that stuff and works fine by itself, but then you can't easily connect
to the server manually?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need to tune for Heavy Write

2011-08-17 Thread Jim Nasby
On Aug 4, 2011, at 10:07 AM, Scott Marlowe wrote:
 On Thu, Aug 4, 2011 at 7:57 AM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 RAM : 16 GB
 
 effective_cache_size = 4096MB
 
 That should probably be more like 12GB to 15GB.  It probably won't
 affect the load time here, but could affect other queries.
 
 Actually on a heavily written database a  large effective cache size
 makes things slower.

effective_cache_size or shared_buffers? I can see why a large shared_buffers 
could cause problems, but what effect does effective_cache_size have on a write 
workload?
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Greg Smith

On 08/17/2011 02:26 PM, Ogden wrote:
I am using bonnie++ to benchmark our current Postgres system (on RAID 
5) with the new one we have, which I have configured with RAID 10. The 
drives are the same (SAS 15K). I tried the new system with ext3 and 
then XFS but the results seem really outrageous as compared to the 
current system, or am I reading things wrong?


The benchmark results are here:
http://malekkoheavyindustry.com/benchmark.html


Congratulations--you're now qualified to be a member of the "RAID5 
sucks" club.  You can find other members at 
http://www.miracleas.com/BAARF/BAARF2.html  Reasonable read speeds and 
just terrible write ones are expected if that's on your old hardware.  
Your new results are what I would expect from the hardware you've 
described.


The only thing that looks weird is your ext4 "Sequential Output - 
Block" results.  They should be between the ext3 and the XFS results, 
not far lower than either.  Normally this only comes from using a bad 
set of mount options.  With a battery-backed write cache, you'd want to 
use "nobarrier", for example; if you didn't do that, that can crush 
output rates.
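
A quick way to confirm which options each filesystem actually got mounted with
(nothing Postgres-specific about it):

    grep /var/lib/pgsql /proc/mounts    # shows the live mount options
    # or just run "mount" and read the ext4/xfs lines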


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Craig Ringer

On 18/08/2011 3:00 AM, Anish Kejariwal wrote:

Thanks Pavel! that definitely solved it.

Unfortunately, the function I gave you was a simple/short version of 
what the actual function is going to be.  The actual function is going 
to get parameters passed to it, and based on the parameters will go 
through some if...else conditions, and maybe even call another 
function.  Based on that, I was definitely hoping to use plpgsql, and 
the overhead is unfortunate.


Is there any way to get around this overhead?  Will I still have the 
same overhead if I use plperl, plpython, pljava, or write the function 
in C?


You can probably still write it as an SQL function if you use CASE WHEN 
appropriately.
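
A minimal sketch of that idea (the function name and logic are invented purely
for illustration): the CASE branches on a parameter the way your if...else
would, the body stays LANGUAGE sql so there is no plpgsql interpreter overhead,
and a simple single-expression body like this can often be inlined by the
planner:

    CREATE OR REPLACE FUNCTION combine(kind text, a numeric, b numeric)
    RETURNS numeric AS $$
        SELECT CASE WHEN $1 = 'sum'  THEN $2 + $3
                    WHEN $1 = 'mean' THEN ($2 + $3) / 2
                    ELSE                  greatest($2, $3)
               END;
    $$ LANGUAGE sql IMMUTABLE;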


--
Craig Ringer


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread mark


 -Original Message-
 From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
 ow...@postgresql.org] On Behalf Of Greg Smith
 Sent: Wednesday, August 17, 2011 3:18 PM
 To: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
 
 On 08/17/2011 02:26 PM, Ogden wrote:
  I am using bonnie++ to benchmark our current Postgres system (on RAID
  5) with the new one we have, which I have configured with RAID 10.
 The
  drives are the same (SAS 15K). I tried the new system with ext3 and
  then XFS but the results seem really outrageous as compared to the
  current system, or am I reading things wrong?
 
  The benchmark results are here:
  http://malekkoheavyindustry.com/benchmark.html
 
 Congratulations--you're now qualified to be a member of the RAID5
 sucks club.  You can find other members at
 http://www.miracleas.com/BAARF/BAARF2.html  Reasonable read speeds and
 just terrible write ones are expected if that's on your old hardware.
 Your new results are what I would expect from the hardware you've
 described.
 
 The only thing that looks weird are your ext4 Sequential Output -
 Block results.  They should be between the ext3 and the XFS results,
 not far lower than either.  Normally this only comes from using a bad
 set of mount options.  With a battery-backed write cache, you'd want to
 use nobarrier for example; if you didn't do that, that can crush
 output rates.
 

To clarify, maybe, for those new to using non-default mount options:

With XFS the mount option is "nobarrier". With ext4 I think it is "barrier=0".

Someone please correct me if I am misleading people or otherwise mistaken.

-mark


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Anish Kejariwal
Thanks for the help Pavel and Craig.  I really appreciate it.  I'm going to
try a couple of these different options (write a c function, use a sql
function with case statements, and use plperl), so I can see which gives me
the realtime performance that I need, and works best for clean code in my
particular case.

thanks!

Anish

On Wed, Aug 17, 2011 at 5:05 PM, Craig Ringer ring...@ringerc.id.au wrote:

  On 18/08/2011 3:00 AM, Anish Kejariwal wrote:

 Thanks Pavel! that definitely solved it.

  Unfortunately, the function I gave you was a simple/short version of what
 the actual function is going to be.  The actual function is going to get
 parameters passed to it, and based on the parameters will go through some
 if...else conditions, and maybe even call another function.  Based on that,
 I was definitely hoping to use plpgsql, and the overhead is unfortunate.

  Is there any way to get around this overhead?  Will I still have the same
 overhead if I use plperl, plpython, pljava, or write the function in C?


 You can probably still write it as an SQL function if you use CASE WHEN
 appropriately.

 --
 Craig Ringer



Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Greg Smith

On 08/17/2011 08:35 PM, mark wrote:

With XFS the mount option is nobarrier. With ext4 I think it is barrier=0


http://www.mjmwired.net/kernel/Documentation/filesystems/ext4.txt

ext4 supports both; nobarrier and barrier=0 mean the same thing.  I 
tend to use nobarrier just because I'm used to that name on XFS systems.
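
So, for instance, either of these is equivalent on ext4 (battery-backed cache
assumed; the device name is a placeholder):

    mount -o noatime,nobarrier  /dev/sdb1 /var/lib/pgsql
    mount -o noatime,barrier=0  /dev/sdb1 /var/lib/pgsql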


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Ogden

On Aug 17, 2011, at 4:16 PM, Greg Smith wrote:

 On 08/17/2011 02:26 PM, Ogden wrote:
 I am using bonnie++ to benchmark our current Postgres system (on RAID 5) 
 with the new one we have, which I have configured with RAID 10. The drives 
 are the same (SAS 15K). I tried the new system with ext3 and then XFS but 
 the results seem really outrageous as compared to the current system, or am 
 I reading things wrong?
 
 The benchmark results are here:
 http://malekkoheavyindustry.com/benchmark.html
 
 
 Congratulations--you're now qualified to be a member of the RAID5 sucks 
 club.  You can find other members at 
 http://www.miracleas.com/BAARF/BAARF2.html  Reasonable read speeds and just 
 terrible write ones are expected if that's on your old hardware.  Your new 
 results are what I would expect from the hardware you've described.
 
 The only thing that looks weird are your ext4 Sequential Output - Block 
 results.  They should be between the ext3 and the XFS results, not far lower 
 than either.  Normally this only comes from using a bad set of mount options. 
  With a battery-backed write cache, you'd want to use nobarrier for 
 example; if you didn't do that, that can crush output rates.


Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 
512Mb Cache. Is this the same thing, and good enough to switch to nobarrier? 
Just worried that if there is a sudden power shutdown, data can be lost with 
this option. 

I did not do that with XFS and it did quite well - I know it's up to my app and 
more testing, but in your experience, what is usually a good filesystem to use? 
I keep reading conflicting things.

Thank you

Ogden



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] heavy load-high cpu itilization

2011-08-17 Thread Jim Nasby
On Jul 30, 2011, at 3:02 PM, Filippos wrote:
 thx a lot for your answer. 
 i will provide some stats, so if you could help me figure out the source of
 the problem that would be great 
 
 -top -c
 Tasks: 1220 total,  49 running, 1171 sleeping,   0 stopped,   0 zombie
 Cpu(s): 84.1%us,  2.8%sy,  0.0%ni, 12.3%id,  0.1%wa,  0.1%hi,  0.6%si,  0.0%st
 Mem:  98846996k total, 98632044k used,   214952k free,   134320k buffers
 Swap: 50331640k total,   116312k used, 50215328k free, 89445208k cached

84% CPU isn't horrible, and you do have idle CPU time available. So you don't 
look to be too CPU-bound, although you need to keep in mind that one process 
might be CPU intensive and taking a long time to run, thereby blocking other 
processes that depend on its results.

 -SELECT count(procpid) FROM pg_stat_activity - 422
 -SELECT count(procpid) FROM pg_stat_activity WHERE (NOW() - query_start) >
 INTERVAL '1 MINUTES' AND current_query = '<IDLE>' - 108
 -SELECT count(procpid) FROM pg_stat_activity WHERE (NOW() - query_start) >
 INTERVAL '5 MINUTES' AND current_query = '<IDLE>' - 45

It would be good to look at getting some connection pooling happening.
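
For example, a bare-bones pgbouncer setup looks roughly like this (names, paths 
and sizes are illustrative only); the idea is just to let the application keep 
its hundreds of client connections while a few dozen real backends do the work:

    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    pool_mode = transaction
    max_client_conn = 500
    default_pool_size = 40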

Your vmstat output shows you generally have CPU available. Can you provide some 
output from iostat -xk 2?

 -*vmstat -n 1 10*
 procs ---memory-- ---swap-- -io --system--
 -cpu--
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id
 wa st
 41  1 116300 347008 134176 8960891200   143   21000 11  1 88 
 0  0
 20  0 116300 423556 134116 8958184000  8336  3038 8 21139 81  5
 13  0  0
 24  0 116300 412904 134108 8954684000  8488  9025 10621 22921 81  4
 15  0  0
 23  0 116300 409388 134084 8951372800  8320   548 11386 20226 82  4
 14  0  0
 34  0 116300 403688 134088 8950952000  6336 0 9552 20994 83  3
 14  0  0
 22  1 116300 337972 134104 8951862400  879228 8980 20455 83  4
 13  0  0
 37  0 116300 303956 134116 8952872000  8440   536 9644 20492 84  3
 13  0  0
 17  1 116300 293212 134112 8953281600  5864  8240 9527 19771 85  3
 12  0  0
 14  0 116300 282168 134116 8954072000  7772   752 10141 21780 84  3
 13  0  0
 44  0 116300 278684 134100 8953608000  7352   555 9856 21539 85  2
 13  0  0
 
 -*vmstat -s*
98846992  total memory
98685392  used memory
40342200  active memory
52644588  inactive memory
  161604  free memory
  129960  buffer memory
89421936  swap cache
50331640  total swap
  116300  used swap
50215340  free swap
  2258553017 non-nice user cpu ticks
 1125281 nice user cpu ticks
   146638389 system cpu ticks
 17789847697 idle cpu ticks
83090716 IO-wait cpu ticks
 5045742 IRQ cpu ticks
38895985 softirq cpu ticks
   0 stolen cpu ticks
 29142450583 pages paged in
 42731005078 pages paged out
   39784 pages swapped in
 3395187 pages swapped out
  1338370564 interrupts
  1176640487 CPU context switches
  1305704895 boot time
24471946 forks
 
 (after 30 sec)
 -*vmstat -s*
98846992  total memory
98367312  used memory
39959952  active memory
52957104  inactive memory
  479684  free memory
  129720  buffer memory
89410640  swap cache
50331640  total swap
  116296  used swap
50215344  free swap
  2258645091 non-nice user cpu ticks
 1125282 nice user cpu ticks
   146640181 system cpu ticks
 17789863186 idle cpu ticks
83090856 IO-wait cpu ticks
 5045855 IRQ cpu ticks
38896749 softirq cpu ticks
   0 stolen cpu ticks
 29142861271 pages paged in
 42731249289 pages paged out
   39784 pages swapped in
 3395187 pages swapped out
  1338808821 interrupts
  1177463384 CPU context switches
  1305704895 boot time
24472003 forks
 
 from the above - context switches/s = (1177463384 - 1176640487)/30 = 27429
 
 thx in advance for any advice
 
 
 
 
 --
 View this message in context: 
 http://postgresql.1045698.n5.nabble.com/heavy-load-high-cpu-itilization-tp4647751p4650542.html
 Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
 
 -- 
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance
 

--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++

2011-08-17 Thread Craig Ringer

On 18/08/2011 11:48 AM, Ogden wrote:

Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 
512Mb Cache. Is this the same thing and good enough to switch to nobarrier? 
Just worried if a sudden power shut down, then data can be lost on this option.


Yeah, I'm confused by that too. Shouldn't a write barrier flush data to 
persistent storage - in this case, the RAID card's battery backed cache? 
Why would it force a RAID controller cache flush to disk, too?


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] How to see memory usage using explain analyze ?

2011-08-17 Thread hyelluas
Igor,

Thank you for the hint; I read about the planner and added vacuum analyze to
my procedures.

There is no join in my query, but there is a GROUP BY that is taking all the
time, and I don't know how to tune it.
It gets executed by the procedure; the execution time requirement is < 4 sec,
but it takes 8-11 sec against 3 partitions (9 mln rec each), and it goes to 22
sec for 5 partitions.
  

I've been testing PostgreSQL performance for the last 2 months, comparing it
with MySQL; PostgreSQL performance with 5+ mln records on a table with 14
columns is worse.
Is 14 columns a big table for Postgres, or is 5 mln rec a big table?

The whole picture is that there are 2 databases, OLTP & OLAP, that used to
be on different machines and in different databases.
The new project requires putting them on one database & machine.

I preferred Postgres (so the poorly designed OLTP would not suffer even more,
as it would on MySQL), and now I'm trying to tune the OLAP db.

thank you.

Helen


--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/How-to-see-memory-usage-using-explain-analyze-tp4694962p4709415.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] tunning strategy needed

2011-08-17 Thread hyelluas
Hello,

I have an old application that was written on Postgres 8.1.
There are a few hundred tables, 30-40 columns per table, hundreds of views,
and all the sql is inside java code.

We are moving it to 8.4, and it seems to be VERY slow.
There are 20-30 table transactions - the objects are spread across
multiple tables, and some tables have data from different objects.

I need a short-term tuning strategy minimizing rewrite & redesign.

Should I start with replacing the sql with procedures?
 
Should I start with replacing the views with the procedures to save time on
recreating an execution plan and parsing?

Should I start with tuning server parameters?
  
all your suggestions are greatly appreciated!

thank you.

Helen

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/tunning-strategy-needed-tp4710245p4710245.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Calculating statistic via function rather than with query is slowing my query

2011-08-17 Thread Craig Ringer

On 18/08/2011 9:03 AM, Anish Kejariwal wrote:
Thanks for the help Pavel and Craig.  I really appreciate it.  I'm 
going to try a couple of these different options (write a c function, 
use a sql function with case statements, and use plperl), so I can see 
which gives me the realtime performance that I need, and works best 
for clean code in my particular case.

Do you really mean realtime? Or just fast?

If you have strongly bounded latency requirements, any SQL-based, 
disk-based system is probably not for you. Especially not one that 
relies on a statistics-based query planner, caching, and periodic 
checkpoints. I'd be looking into in-memory databases designed for 
realtime environments where latency is critical.


Hard realtime: If this system fails to respond within x milliseconds, 
all the time, every time, then something will go smash or boom 
expensively and unrecoverably.


Soft realtime: If this system responds late, the late response is 
expensive or less useful. Frequent late responses are unacceptable but 
the occasional one might be endurable.


Just needs to be fast: If it responds late, the user gets irritated 
because they're sitting and waiting for a response. Regular long stalls 
are unacceptable, but otherwise the user can put up with it. You're more 
concerned with average latency than maximum latency.


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] tunning strategy needed

2011-08-17 Thread Craig Ringer

On 18/08/2011 6:40 AM, hyelluas wrote:

Hello,

I have an old application that was written on Postgres 8.1.
There are a few hundred tables, 30-40 columns per table, hundreds of views,
and all the sql is inside java code.

We are moving it to 8.4, and it seems to be VERY slow.
There are 20-30 table transactions - the objects are spread across
multiple tables, and some tables have data from different objects.

I need a short-term tuning strategy minimizing rewrite & redesign.




- Turn on auto_explain and slow query logging (a settings sketch for this and 
the statistics targets follows this list)

- Examine the slow queries and plans. Run them manually with EXPLAIN 
ANALYZE. Check that the statistics make sense and if they're inaccurate, 
increase the statistics targets on those columns/tables then re-ANALYZE.


- If the stats are accurate but the query is still slow, try playing 
with the cost parameters and see if you get a better result, then test 
those settings server-wide to see if they improve overall performance.
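
For example (a rough sketch only -- thresholds, table and column names are
placeholders; on 8.4 the custom_variable_classes line is needed before the
auto_explain.* setting):

    # postgresql.conf
    log_min_duration_statement = 500           # log anything slower than 500 ms
    shared_preload_libraries = 'auto_explain'
    custom_variable_classes = 'auto_explain'
    auto_explain.log_min_duration = 500

    -- for columns whose row estimates are badly off:
    ALTER TABLE some_table ALTER COLUMN some_col SET STATISTICS 500;
    ANALYZE some_table;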


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance