Re: [PERFORM] Why should such a simple query over indexed columns be so slow?

2012-01-30 Thread Fernando Hevia
On Mon, Jan 30, 2012 at 17:35, Alessandro Gagliardi alessan...@path.com wrote:

 Well that was a *lot* faster:

 HashAggregate  (cost=156301.82..156301.83 rows=2 width=26) (actual
 time=2692.806..2692.807 rows=2 loops=1)
   ->  Bitmap Heap Scan on blocks  (cost=14810.54..155828.95 rows=472871
 width=26) (actual time=289.828..1593.893 rows=575186 loops=1)
         Recheck Cond: (created > '2012-01-29 00:00:00+00'::timestamp
 with time zone)
         Filter: (shared IS FALSE)
         ->  Bitmap Index Scan on blocks_created_idx  (cost=0.00..14786.89
 rows=550404 width=0) (actual time=277.407..277.407 rows=706663 loops=1)
               Index Cond: (created > '2012-01-29 00:00:00+00'::timestamp
 with time zone)
 Total runtime: 2693.107 ms


Are you sure the new timing isn't owed to cached data? If I am reading it
correctly, from the latest explain you posted the Index Scan shouldn't have
made a difference, as it reports that pretty much all rows in the table have
created > 'yesterday'.
If the fraction of rows with created > 'yesterday' is significant (roughly
over 25% with the default config) a full scan will be chosen, and it will
probably be the better choice too.
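
A quick way to test the cached-data hypothesis (a sketch: it assumes
PostgreSQL 9.0+ for the BUFFERS option, and the query is only a guess
reconstructed from the plan above) is to run it twice and compare buffer
reads against hits:

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT shared, count(*)
    FROM blocks
   WHERE created > '2012-01-29 00:00:00+00'
   GROUP BY shared;

If the second run reports mostly "shared hit" and few "read" buffers, the
speedup is coming from cached data rather than from the index.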


Re: [PERFORM] PostgreSQL-9.0 Monitoring System to improve performance

2011-10-07 Thread Fernando Hevia
pgwatch might also be worth taking a look at:
http://www.cybertec.at/en/postgresql_products/pgwatch-cybertec-enterprise-postgresql-monitor

Fernando.-

On Fri, Sep 30, 2011 at 18:29, Bobby Dewitt bdew...@appriss.com wrote:

 EnterpriseDB now has Postgres Enterprise Manager
 (http://enterprisedb.com/products-services-training/products/postgres-enterprise-manager)
 that has some of the information that is being asked for.
 It has a hot table analysis report that shows the number of scans, rows
 read, etc.  Since much of the tool is using the pgAdmin code base, much of
 this is also available in pgAdmin but PEM will track the statistics at
 given intervals and show you trending graphs over time.  It's still a very
 new tool so I'm sure they are working to add new features and have been
 looking for enhancement suggestions.  Of course, it requires a service
 contract with them to use the tool, but it doesn't cost extra to add the
 tool if you already have a contract with them.  It does have a 45 day
 evaluation if you wanted to check it out.

 Hope that helps.
 Bobby

 On 9/30/11 7:53 AM, Gregg Jaskiewicz gryz...@gmail.com wrote:

 Looks like this is generally an area that can be targeted by some
 businesses. Or an open source enthusiast.
 One centre that captures all the information and produces a report
 based on it would be a great thing. Especially in cases like mine,
 where I have tens of postgresql installations on different hardware
 and with different use patterns (but schemas and queries are the
 same).


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] postgres performance tunning

2010-12-20 Thread Fernando Hevia
On Fri, Dec 17, 2010 at 07:48, selvi88 selvi@gmail.com wrote:


 My requirement is that more than 15 thousand queries will run;
 5000 will be updates, 5000 inserts and the rest will be selects.


What IO system are you running Postgres on? With that volume of writes you
should really be focusing on your storage solution.


 Each query will be executed in its own psql client (let's say for 15000
 queries, 15000 psql connections will be made).


You will benefit from a connection pooler. Try fiddling with
max_connections till you hit a sweet spot. Probably you should start
with 20 connections and go up till you see your tps decrease.

Still, without deeply looking into your storage I wonder if you'll ever
reach your TPS objective.


Re: [PERFORM] bgwriter, checkpoints, curious (seeing delays)

2010-02-26 Thread Fernando Hevia
 

 -Original Message-
 From: Tory M Blue
 
 2010/2/25 Devrim GÜNDÜZ dev...@gunduz.org:
  On Thu, 2010-02-25 at 22:12 -0800, Tory M Blue wrote:
  shared_buffers = 1500MB
 
  Some people tend to increase this to 2.2GB(32-bit) or 4-6 
 GB (64 bit), 
  if needed. Please note that more shared_buffers will lead to more 
  pressure on bgwriter, but it also has lots of benefits, too.
 
  work_mem = 100MB
 
  This is too much. Since you have 300 connections, you will probably 
  swap because of this setting, since each connection may use 
 this much 
  work_mem. The rule of the thumb is to set this to a lower general 
  value (say, 1-2 MB), and set it per-query when needed.
 
 I'm slightly confused. Most things I've read, including 
 running pg_tune for grins puts this around 100MB, 98MB for 
 pgtune.  1-2MB just seems really low to me. And Ignore the 
 300 connections, thats an upper limit, I usually run a max of 
 40-45 but usually around 20 connections per sec.
 

If you have a maximum of 45 users simultaneously running a rather complex
query that requires, say, 3 sorts, that's 45 x 100MB x 3 = 13.5 GB of RAM
used up because of this particular work_mem setting. It doesn't mean it will
happen, just that your settings make this scenario possible.

So, to avoid this scenario, the suggestion is to keep work_mem low and
adjust it per query if required. I find 1-2 MB too low for my particular
requirements so I have it at 8 MB. Anyway, since your server has a lot of
RAM, your setting might make sense. But all that memory would probably be
better used if it was available for caching.
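
The usual pattern is a sketch like this (the 100MB figure is just the value
under discussion, not a recommendation):

  -- postgresql.conf keeps a low default, e.g. work_mem = 8MB
  SET work_mem = '100MB';   -- raise it for this session only
  -- ... run the sort-heavy query here ...
  RESET work_mem;           -- fall back to the server default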

 
 Also is there a way to log if there are any deadlocks 
 happening (I'm not seeing them in logs) deadlock_timeout = 5s
 

In postgresql.conf:
log_lock_waits = on   # log lock waits >= deadlock_timeout


Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] new server I/O setup

2010-01-15 Thread Fernando Hevia
 

 -Original Message-
 From: Scott Marlowe
 
 I think your first choice is right.  I use the same basic 
 setup with 147G 15k5 SAS seagate drives and the pg_xlog / OS 
 partition is almost never close to the same level of 
 utilization, according to iostat, as the main 12 disk RAID-10 
 array is.  We may have to buy a 16 disk array to keep up with 
 load, and it would be all main data storage, and our pg_xlog 
 main drive pair would be just fine.
 


  Do you think a single RAID 1 will become a 
 bottleneck? 
  Feel free to suggest a better setup I hadn't considered, it 
 would be 
  most welcome.
 
 For 12 disks, most likely not.  Especially since your load is 
 mostly small randomish writes, not a bunch of big 
 multi-megabyte records or anything, so the random access 
 performance on the 12 disk RAID-10 should be your limiting factor.
 

Good to know this setup has been tried successfully.
Thanks for the comments.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] new server I/O setup

2010-01-15 Thread Fernando Hevia
 

 -Original Message-
 From: Greg Smith
 
 Fernando Hevia wrote: 
 
  I justified my first choice in that WAL writes are 
 sequential and OS writes pretty much are too, so a RAID 1 probably 
 would hold ground against a 12 disc RAID 10 with random writes.
 
 
 The problem with this theory is that when PostgreSQL does WAL 
 writes and asks to sync the data, you'll probably discover 
 all of the open OS writes that were sitting in the Linux 
 write cache getting flushed before that happens.  And that 
 could lead to horrible performance--good luck if the database 
 tries to do something after cron kicks off updatedb each 
 night for example.
 

I actually hadn't considered such a scenario. It probably won't hit us
because our real-time activity diminishes abruptly overnight, when
maintenance routines kick in.
But in case this proves to be an issue, disabling synchronous_commit should
help, and thanks to the BBU cache the risk of lost transactions should be
very low. In any case I would leave it on till the issue arises. Do you
agree?

In our business the worst case would translate to losing a couple of
seconds' worth of call records, all recoverable from secondary storage.


 I think there are two viable configurations you should be 
 considering here, but neither is quite what you're looking at:
 
 2 discs in RAID 1 for OS
 2 discs in RAID 1 for pg_xlog
 10 discs in RAID 10 for postgres, ext3
 2 spares.
 
 14 discs in RAID 10 for everything
 2 spares.
 
 Impossible to say which of the four possibilities here will 
 work out better.  I tend to lean toward the first one I 
 listed above because it makes it very easy to monitor the 
 pg_xlog activity (and the non-database activity) separately 
 from everything else, and having no other writes going on 
 makes it very unlikely that the pg_xlog will ever become a 
 bottleneck.  But if you've got 14 disks in there, it's 
 unlikely to be a bottleneck anyway.  The second config above 
 will get you slightly better random I/O though, so for 
 workloads that are really limited on that there's a good 
 reason to prefer it.
 

Besides the random writing, we have quite intensive random reads too. I need
to maximize throughput on the RAID 10 array, and the thought of taking 2
more disks away from it makes me rather uneasy.
I did consider the 14-disk RAID 10 for everything since it's very attractive
for read I/O. But with 12 spindles read I/O should be incredibly fast for
us, considering our current production server has a meager 4-disk RAID 10.
I still think the 2-disk RAID 1 + 12-disk RAID 10 will be the best
combination for write throughput, provided the RAID 1 can keep pace with the
RAID 10, something Scott already confirmed to be his experience.

 Also:  the whole "use ext2 for the pg_xlog" idea is overrated 
 as far as I'm concerned.  I start with ext3, and only if I get 
 evidence that the drive is a bottleneck do I ever think of 
 reverting to unjournaled writes just to get a little speed 
 boost.  In practice I suspect you'll see no benchmark 
 difference, and will instead curse the decision the first 
 time your server is restarted badly and it gets stuck at fsck.
 

This advice could be interpreted as "start safe and take risks only if
needed".
I think you are right and will follow it.

  PS: any clue if hdparm works to deactivate the disks' 
 write cache even if they are behind the 3ware controller?
  
 
 You don't use hdparm for that sort of thing; you need to use 
 3ware's tw_cli utility.  I believe that the individual drive 
 caches are always disabled, but whether the controller cache 
 is turned on or not depends on whether the card has a 
 battery.  The behavior here is kind of weird though--it 
 changes if you're in RAID mode vs. JBOD mode, so be careful 
 to look at what all the settings are.  Some of these 3ware 
 cards default to extremely aggressive background scanning for 
 bad blocks too, you might have to tweak that downward too.
 

It has a battery and it is working in RAID mode. 
It's also my first experience with a hardware controller. I'm installing
tw_cli at this moment.

Greg, I hold your knowledge in this area in very high regard. 
Your comments are much appreciated.


Thanks,
Fernando


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] new server I/O setup

2010-01-15 Thread Fernando Hevia
 

 -Original Message-
 From: Matthew Wakeling [mailto:matt...@flymine.org] 
 Sent: Friday, January 15, 2010 08:21
 To: Scott Marlowe
 CC: Fernando Hevia; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] new server I/O setup
 
 On Thu, 14 Jan 2010, Scott Marlowe wrote:
  I've just received this new server:
  1 x XEON 5520 Quad Core w/ HT
  8 GB RAM 1066 MHz
  16 x SATA II Seagate Barracuda 7200.12 3ware 9650SE w/ 256MB BBU
 
  2 discs in RAID 1 for OS + pg_xlog partitioned with ext2.
  12 discs in RAID 10 for postgres data, sole partition with ext3.
  2 spares
 
  I think your first choice is right.  I use the same basic 
 setup with 
  147G 15k5 SAS seagate drives and the pg_xlog / OS partition 
 is almost 
  never close to the same level of utilization, according to 
 iostat, as 
  the main 12 disk RAID-10 array is.  We may have to buy a 16 
 disk array 
  to keep up with load, and it would be all main data 
 storage, and our 
  pg_xlog main drive pair would be just fine.
 
 The benefits of splitting off a couple of discs for WAL are 
 dubious given the BBU cache, given that the cache will 
 convert the frequent fsyncs to sequential writes anyway. My 
 advice would be to test the difference. If the bottleneck is 
 random writes on the 12-disc array, then it may actually help 
 more to improve that to a 14-disc array instead.

I am new to the BBU cache benefit and I have a lot to experience and learn.
Hopefully I will have the time to test both setups.
I was wondering if disabling the BBU cache on the RAID 1 array would make
any difference: all 256MB would then be available for the random I/O on the
RAID 10.

 
 I'd also question whether you need two hot spares, with 
 RAID-10. Obviously that's a judgement call only you can make, 
 but you could consider whether it is sufficient to just have 
 a spare disc sitting on a shelf next to the server rather 
 than using up a slot in the server. Depends on how quickly 
 you can get to the server on failure, and how important the data is.
 

This is something I haven't been able to make up my mind about, since it's
very painful to lose those 2 slots.
They could make for the dedicated pg_xlog RAID 1 Greg is suggesting.
Very tempting, but I still think I will start safe for now and see what
happens later.

Thanks for your insight.

Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] new server I/O setup

2010-01-15 Thread Fernando Hevia
 

 -Original Message-
 From: Pierre Frédéric Caillaud
 Sent: Friday, January 15, 2010 15:00
 To: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] new server I/O setup
 
 
   No-one has mentioned SSDs yet ?...
 

The post is about an already purchased server just delivered to my office. I
have been following with interest the posts about SSD benchmarking, but no
SSDs were bought on this opportunity and we have no budget to buy them
either, at least not in the foreseeable future.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] new server I/O setup

2010-01-14 Thread Fernando Hevia
Hi all,
 
I've just received this new server:
1 x XEON 5520 Quad Core w/ HT
8 GB RAM 1066 MHz
16 x SATA II Seagate Barracuda 7200.12
3ware 9650SE w/ 256MB BBU
 
It will run Ubuntu 8.04 LTS as a dedicated Postgres 8.4 server. Its database
will be getting between 100 and 1000 inserts per second (those are call
detail records of ~300 bytes each) from around 20 clients (VoIP gateways).
Other activity is mostly read-only, with some non-time-critical writes
generally at off-peak hours.
 
So my first choice was:
 
2 discs in RAID 1 for OS + pg_xlog partitioned with ext2.
12 discs in RAID 10 for postgres data, sole partition with ext3.
2 spares
 
 
My second choice is:
 
4 discs in RAID 10 for OS + pg_xlog partitioned with ext2
10 discs in RAID 10 for postgres, ext3
2 spares.
 
The BBU cache will be enabled for both RAID volumes.
 
I justified my first choice in that WAL writes are sequential and OS writes
pretty much are too, so a RAID 1 would probably hold its ground against a 12
disc RAID 10 doing random writes.
 
I don't know in advance if I will manage to gather enough time to try out
both setups so I wanted to know what you guys think of these 2 alternatives.
Do you think a single RAID 1 will become a bottleneck? Feel free to suggest
a better setup I hadn't considered, it would be most welcome.
 
PS: any clue if hdparm works to deactivate the disks' write cache even if
they are behind the 3ware controller?
 
Regards,
Fernando.
 


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Fernando Hevia
 

 -Original Message-
 From: fka...@googlemail.com

 Nevertheless: If your explanation covers all what can be said 
 about it then replacing the hard disk by a faster one should 
 increase the performance here (I'll try to check that out).
 

Moving the pg_xlog directory to the OS drive should make a difference and it
will cost you zero.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] forced sequential scan when condition has current_user

2010-01-05 Thread Fernando Hevia
 

 -Original Message-
 From: Keresztury Balázs
 
 hi,
 
 just a small question: is it normal that PostgreSQL 8.4.1 
 always uses sequential scanning on any table when there is a 
 condition having the constant current_user? Of course there 
 is a btree index set on that table, but the DBMS just doesn't 
 want to utilize it. When I replace current_user to any 
 string, the planner uses the index normally.
 
 I can demonstrate it with the following simple query:
 
 SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE 
 psz.felhasznalo_id = current_user;
 

Probably you are comparing different types. Try explicitly casting
current_user to text:

SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE 
psz.felhasznalo_id = current_user::text
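
To verify, the plan should switch from a Seq Scan to an Index Scan once the
cast is in place:

EXPLAIN SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
psz.felhasznalo_id = current_user::text;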




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Server Freezing

2009-12-01 Thread Fernando Hevia
 

 -Original Message-
 From: Waldomiro
 
 I'm using PostgreSQL 8.1, 

Sorry, log_checkpoints isn't supported till 8.3

 and my settings are:
 
 checkpoint_segments=50
 checkpoint_timeout=300
 checkpoint_warning=30
 commit_delay=0
 commit_siblings=5
 archive_command = 'cp -i %p /BACKUP/LOGS/%f'
 autovacuum=off
 bgwriter_all_maxpages=5
 bgwriter_all_percent=0.333
 bgwriter_delay=200
 bgwriter_lru_maxpages=5
 bgwriter_lru_percent=1
 fsync=on
 full_page_writes=on
 stats_block_level=on
 stats_command_string=on
 stats_reset_on_server_start=off
 stats_row_level=on
 stats_start_collector=on
 

As tempting as it is to decrease checkpoint_segments, better confirm it is a
checkpoint-related problem before fiddling with these settings.

I recommend reading Greg Smith's post on checkpoints and the bgwriter. It's
about the 8.3 improvements but it includes good advice on how to diagnose
checkpoint issues on prior versions:
http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm

In fact, one of his recommendations should be very helpful here: set
checkpoint_warning=3600 and log_min_duration_statement=1000; that way you
should see in the log whether statements over 1 sec occur simultaneously
with checkpoints being reached.
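
In postgresql.conf that would be:

checkpoint_warning = 3600            # seconds
log_min_duration_statement = 1000    # milliseconds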

Pay attention to the chapter on the bgwriter too.

Regards,
Fernando.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Server Freezing

2009-11-30 Thread Fernando Hevia
 

 -Original Message-
 From: pgsql-performance-ow...@postgresql.org 
 [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Waldomiro
 Sent: Monday, November 30, 2009 22:03
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] Server Freezing
 
 Hi everybody,
 
 ...
 
 That table receives about 3000 inserts and 6 updates each 
 day, but at night I do a TRUNCATE TABLE1 (Every Night), so 
 the table is very small. There is an index by field1 too.
 
 Some days it works very well all day, but some days I have a 7-second 
 freeze, I mean, my server delays 7 seconds on this statement:
   SELECT field1
   FROM TABLE1
   WHERE field2 = '10'

Hi.
You should probably consider creating a partial index on field2 = '10'.
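
Something like this (a sketch: the lowercase table and column names are
guesses at the actual schema):

CREATE INDEX table1_field2_10_idx ON table1 (field1) WHERE field2 = '10';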

 I looked at the statistics for that table, and the 
 statistics say that postgres is reading memory, not disk, 
 because the table is very small and I do a select every 
 second, so postgres keeps the table in shared buffers.


You say you don't vacuum this table, but consider the updates to those 3000
records: assuming you update each record 20 times, the table could eat up
the space of 60,000 row versions. Have you considered this?

Though I am not sure how much this matters when the whole table is held in
shared buffers.
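
A quick check (the lowercase table name is a guess; pg_class is available
on 8.1) is to compare the table's page count against its live row estimate:

SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'table1';

If relpages is far larger than 3000 rows should need, dead row versions are
accumulating between the nightly truncates.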

 
 Why this 7 seconds delay? How could I figure out what is happening?
 

Turn log_checkpoints = on to see in the logs if these occur during the
freeze.
Also log_lock_waits = on will help diagnose the situation.

What version of postgres are you running and how are your checkpoints
configured?

Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres query completion status?

2009-11-20 Thread Fernando Hevia
 

 -Original Message-
 From: Richard Neill
 
 
 max_connections = 500   # (change requires restart)
 work_mem = 256MB# min 64kB

Not that it has to do with your current problem, but this combination could
bog down your server if enough clients run sorted queries simultaneously.
You should probably back off work_mem by at least an order of magnitude.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres query completion status?

2009-11-20 Thread Fernando Hevia
 

 -Original Message-
 From: Richard Neill
 
 Fernando Hevia wrote:
   
  
  -Original Message-
  From: Richard Neill
 
 
  max_connections = 500   # (change requires restart)
  work_mem = 256MB# min 64kB
  
  Not that it has to do with your current problem, but this combination 
  could bog down your server if enough clients run sorted queries 
  simultaneously.
  You should probably back off work_mem by at least an order of magnitude.
  
 
 What's the correct way to configure this?
 
 * We have one client which needs to run really big 
 transactions (therefore needs the work memory).
 
 * We also have about 200 clients which run always very small, 
 short queries.
 
 Richard
 

Set the default value in postgresql.conf much lower, probably 4MB.
Then, just before running any big transaction, raise it for 
the current session only by issuing:
  set work_mem = '256MB';

Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] SSD + RAID

2009-11-13 Thread Fernando Hevia
 

 -Original Message-
 Laszlo Nagy
 
 My question is about the last option. Are there any good RAID 
 cards that are optimized (or can be optimized) for SSD 
 drives? Do any of you have experience in using many cheaper 
 SSD drives? Is it a bad idea?
 
 Thank you,
 
Laszlo
 

Never had an SSD to try yet; still, I wonder if software RAID + fsync on SSD
drives could be regarded as a sound solution. Shouldn't their write
performance more than compensate for the cost of fsync?

You could benchmark this setup yourself before purchasing a RAID card.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] High CPU load on Postgres Server during Peak times!!!!

2009-09-23 Thread Fernando Hevia

 User Access
 Total Number of Users is 500
 Maximum number of Concurrent users will be 500 during peak time
 Off Peak time the maximum number of concurrent user will be
 around 150 to 200.


A connection pooler like pgpool or pgbouncer would considerably reduce the
burden on your system.


I am already using connection pooling in the Tomcat web server, so will
installing pgpool help enhance performance? Any changes I have to do in my
application to include pgpool?


There shouldn't be need for another pooling solution.
Anyway, you probably don't want 1800 concurrent connections on your database
server, nor even to get near that number.

Check the number of actual connections with: 
  select count(*) from pg_stat_activity;

A vmstat run during high loads could provide insight into whether the number
of connections is straining your server.

If the number of connections is high (say over 200-300), try reducing the
pool size in Tomcat and see what happens.
You possibly could do fine with something between 50 and 100 connections.


Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] High CPU load on Postgres Server during Peak times!!!!

2009-09-22 Thread Fernando Hevia
 

 -Original Message-
 From: Shiva Raman
 Sent: Tuesday, September 22, 2009 10:55
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] High CPU load on Postgres Server during 
 Peak times
 
 Dear all 
 
   I am having a problem of high cpu loads in my postgres 
 server during peak time. 


Some quick advice:

 
 clusternode2:~ # rpm -qa | grep postgres
 postgresql-devel-8.1.9-1.2
 postgresql-8.1.9-1.2
 postgresql-docs-8.1.9-1.2
 postgresql-server-8.1.9-1.2
 postgresql-libs-64bit-8.1.9-1.2
 postgresql-libs-8.1.9-1.2
 postgresql-jdbc-8.1-12.2
 postgresql-contrib-8.1.9-1.2
 
 

8.1 is quite old. Consider upgrading as newer versions are faster.
Current Postgres version is 8.4. 

 
 High Availability Cluster with two IBM P Series Server and 
 one DS4700 Storage
 
 IBM P series P52A with 2-core 2.1 Ghz POWER5+ Processor Card 
 , 36 MB L3 Cache ,16 GB of RAM,
 73.4 GB 10,000 RPM Ultra320 SCSI Drive for Operating System .  
 

Sounds like you are underpowered on CPU for 500 concurrent users.
Of course this really depends on what they are doing.

 
 IBM SAN DS4700 Storage with Fibre Channel HDD (73.4 GB * 10) 
 Two Partitions - 73.4 GB * 3 RAID 5 - 134 GB storage 
 partitions (One holding Jakarata tomcat
 application server and other holding Postgresql Database) .
 Four Hard disk RAID 5 with ext3 file systems hold the pgdata on SAN . 
 Hard disk rotational speed is 73 GB 15K IBM 2 GB Fibre channel 
 

A more suitable partitioning for an OLTP database would be:

2 x 73.4 GB RAID 1 for App Server + Postgresql and pg_xlog
8 x 73.4 GB RAID 10 for pgdata

RAID 5 is strongly discouraged.

 
 Following is the output of TOP command during offpeak time. 
 
 
 top - 18:36:56 up 77 days, 20:33,  1 user,  load average: 
 12.99, 9.22, 10.37
 Tasks: 142 total,  12 running, 130 sleeping,   0 stopped,   0 zombie
 Cpu(s): 46.1%us,  1.9%sy,  0.0%ni,  6.1%id,  3.0%wa,  0.0%hi, 
  0.1%si, 42.9%st
 Mem:  16133676k total, 13657396k used,  2476280k free,   
 450908k buffers
 Swap: 14466492k total,  124k used, 14466368k free, 
 11590056k cached
 
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  
 COMMAND   
 22458 postgres  19   0 2473m 477m 445m R   40  3.0   0:15.49 
 postmaster
 22451 postgres  15   0 2442m 447m 437m S   33  2.8   0:30.44 
 postmaster
 22464 postgres  17   0 2443m 397m 383m R   28  2.5   0:13.78 
 postmaster
 22484 postgres  16   0 2448m 431m 412m S   20  2.7   0:02.73 
 postmaster
 22465 postgres  17   0 2440m 461m 449m R   15  2.9   0:03.52 
 postmaster
 22452 postgres  16   0 2450m 727m 706m R   13  4.6   0:23.46 
 postmaster
 22476 postgres  16   0 2437m 413m 405m S   13  2.6   0:06.11 
 postmaster
 22485 postgres  16   0 2439m 230m 222m R7  1.5   0:05.72 
 postmaster
 22481 postgres  15   0 2436m 175m 169m S7  1.1   0:04.44 
 postmaster
 22435 postgres  17   0 2438m 371m 361m R6  2.4   1:17.92 
 postmaster
 22440 postgres  17   0 2445m 497m 483m R5  3.2   1:44.50 
 postmaster
 22486 postgres  17   0 2432m  84m  81m R4  0.5   0:00.76 
 postmaster
 

Are you running several Postgres clusters on this hardware?
Please post top output showing the command line arguments (press 'c').


 
 User Access 
 Total Number of Users is 500 
 Maximum number of Concurrent users will be 500 during peak time
 Off Peak time the maximum number of concurrent user will be 
 around 150 to 200. 
 

A connection pooler like pgpool or pgbouncer would considerably reduce the
burden on your system.


Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner question - bit data types

2009-09-07 Thread Fernando Hevia
 

 -Original Message-
 From: Karl Denninger
 Sent: Saturday, September 5, 2009 21:19
 To: Alvaro Herrera
 CC: Tom Lane; Merlin Moncure; Josh Berkus; 
 pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Planner question - bit data types
 
 There was a previous thread and I referenced it. I don't have 
 the other one in my email system any more to follow up to it.
 
 I give up; the attack-dog crowd has successfully driven me off.  Ciao.
 
 Alvaro Herrera wrote: 
 
   Karl Denninger wrote:
 
     Tom Lane wrote:
 
       You never showed us any EXPLAIN results,
 
     Yes I did.  Go back and look at the archives.  I provided full
     EXPLAIN and EXPLAIN ANALYZE results for the original query.  Sheesh.
 
   You did?  Where?  This is your first message in this thread:
   http://archives.postgresql.org/pgsql-performance/2009-09/msg00059.php
   No EXPLAINs anywhere to be seen.

I guess this is the post Karl refers to:

http://archives.postgresql.org/pgsql-sql/2009-08/msg00088.php

Still, you can't expect others to recall a post from 2 weeks ago, with
another subject and on another list!



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Fernando Hevia
 

 -Original Message-
 From: Paolo Rizzi
 
 Hi all,
 recently I came across a question from a customer of mine, 
 asking me if it would feasible to run PostgreSQL along with 
 PostGIS on embedded hardware.
 They didn't give me complete information, but it should be 
 some kind of industrial PC with a 600MHz CPU. Memory should 
 be not huge nor small, maybe a couple of GBytes, hard disk 
 should be some type of industrial Compact Flash of maybe 16 GBytes.
 
 They are thinking about using this setup on-board of public 
 buses and trams, along with a GPS receiver, for 
 self-localization. So that when the bus or tram enters 
 defined zones or passes near defined points, events are triggered.
 The database could probably be used completely read-only or 
 almost that.
 

Hi Paolo,

I'm not really responding to your question. It happens that I collaborated
on a postgres/postgis based solution for public transportation, and the
motive why you are trying to put the database in the embedded hardware is
puzzling to me. In that solution we used a centralized PG database; the
devices in buses captured geographical position and other business-related
data and sent it over the cellular network to the central server.
Calculations on position were made on the server and the related events
were sent back accordingly.

If possible, I would like to know what drives you to put a database on each
device. Don't you have a wireless link on each unit?


 What performances do you think would be possible for 
 PostgreSQL+PostGIS on such hardware???

We never considered that solution so I couldn't say.

 
 Bye
 Paolo
 

Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance Question

2008-11-12 Thread Fernando Hevia
Increase shared_buffers to 1024MB, set effective_cache_size to 6000MB
and test again.
To speed up sort operations, increase work_mem till you notice an
improvement.
Play with different values for those settings.
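
In postgresql.conf that would look like this (assuming 8.2+, which accepts
MB units; changing shared_buffers requires a restart):

shared_buffers = 1024MB
effective_cache_size = 6000MB
work_mem = 64MB     # then experiment per session with SET work_mem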
 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of - -
Sent: Wednesday, November 12, 2008 14:28
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Performance Question


I've been searching for performance metrics and tweaks for a few weeks now.
I'm trying to determine if the length of time to process my queries is
accurate or not and I'm having a difficult time determining that. I know
postgres performance is very dependent on hardware and settings and I
understand how difficult it is to tackle. However, I was wondering if I
could get some feedback based on my results please.

The database is running on a dual-core 2GHz Opteron processor with 8GB of
RAM. The drives are 10K RPM 146GB drives in RAID 5 (I've read RAID 5 is bad
for Postgres, but moving the database to another server didn't change
performance at all). Some of the key parameters from postgresql.conf are:

max_connections = 100
shared_buffers = 16MB
work_mem = 64MB
everything else is set to the default

One of my tables has 660,000 records and doing a SELECT * from that table
(without any joins or sorts) takes 72 seconds. Ordering the table based on 3
columns almost doubles that time to an average of 123 seconds. To me, those
numbers are crazy slow and I don't understand why the queries are taking so
long. The tables are UTF-8 encode and contain a mix of languages (English,
Spanish, etc). I'm running the query from pgadmin3 on a remote host. The
server has nothing else running on it except the database.

As a test I tried splitting up the data across a number of other tables. I
ran 10 queries (to correspond with the 10 tables) with a UNION ALL to join
the results together. This was even slower, taking an average of 103 seconds
to complete the generic select all query.

I'm convinced something is wrong, I just can't pinpoint where it is. I can
provide any other information necessary. If anyone has any suggestions it
would be greatly appreciated. 





Re: [PERFORM] Best hardware/cost tradoff?

2008-09-01 Thread Fernando Hevia
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of cluster
 Sent: Saturday, August 30, 2008 07:21
 To: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Best hardware/cost tradoff?
 
 We are now leaning towards just buying 4 SAS disks.
 
 So should I just make one large RAID-10 partition or make two 
 RAID-1's having the log on one RAID and everything else on 
 the second RAID?
 How can I get the best read/write performance out of these four disks?
 (Remember, that it is a combined web-/database server).
 

Make a single RAID 10. It's simpler and it will provide better write
performance, which is where your bottleneck will be. I think you should
minimize the web server's role in this equation as it should mostly work on
cached data.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best hardware/cost tradoff?

2008-08-28 Thread Fernando Hevia
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of cluster
 
 I'm about to buy a combined web- and database server. When 
 (if) the site gets sufficiently popular, we will split the 
 database out to a separate server.
 
 Our budget is limited, so how should we prioritize?
 
 * We think about buying some HP Proliant server with at least 
 4GB ram and at least a duo core processor. Possibly quad 
 core. The OS will be debian/Linux.
 
 * Much of the database will fit in RAM so it is not *that* 
 necessary to prefer the more expensive 10K RPM SAS drives 
 to the cheaper 7500 RPM SATA drives, is it? There will 
 be both many read and write queries and a *lot* (!) of random reads.
 
 * I think we will go for hardware-based RAID 1 with a good 
 battery-backed-up controller. I have read that software RAID 
 perform surprisingly good, but for a production site where 
 hotplug replacement of dead disks is required, is software 
 RAID still worth it?
 
 Anything else we should be aware of?
 

I haven't had any issues with software RAID (mdadm) and hot-swaps. It keeps
working in degraded mode and as soon as you replace the defective disk it
can reconstruct the array on the fly. Performance will suffer while at it,
but the service keeps up.
The battery backup makes a very strong point for a HW controller. Still, I
have heard good things about combining a HW controller with JBODs, leaving
the RAID affair to mdadm. In your scenario though, with *lots* of random
reads, if I had to choose between a HW controller + 2 disks or software RAID
with 4 or 6 disks, I would go for the disks. There are motherboards with 6
SATA ports. For the money you will save on the controller you can afford 6
disks in a RAID 10 setup.

Cheers,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Big delete on big table... now what?

2008-08-25 Thread Fernando Hevia
 Gregory Stark [EMAIL PROTECTED] writes:
 
 Bill Moran [EMAIL PROTECTED] writes:
 
  Fernando Hevia [EMAIL PROTECTED] wrote:
  Hi list.
  I have a table with over 30 million rows. Performance was dropping 
  steadily so I moved old data not needed online to an 
 historic table. 
  Now the table has about 14 million rows. I don't need the 
 disk space 
  returned to the OS but I do need to improve performance. 
 Will a plain 
  vacuum do or is a vacuum full necessary?
  Would a vacuum full improve performance at all?
 
  If you can afford the downtime on that table, cluster would be best.
 
  If not, do the normal vacuum and analyze.  This is unlikely 
 to improve 
  the performance much (although it may shrink the table _some_) but 
  regular vacuum will keep performance from getting any worse.
 
 Note that CLUSTER requires enough space to store the new and 
 the old copies of the table simultaneously. That's the main 
 reason for VACUUM FULL to still exist.
 
 There is also the option of doing something like (assuming id 
 is already an integer -- ie this doesn't actually change the data):
 
  ALTER TABLE x ALTER id TYPE integer USING id;
 
 which will rewrite the whole table. This is effectively the 
 same as CLUSTER except it doesn't order the table according 
 to an index. It will still require enough space to hold two 
 copies of the table but it will be significantly faster.
 

Yes, I can afford downtime on Sunday.
Actually the clustering option would help, since most of our slow queries
use the same index.
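
For the record, a minimal sketch of that maintenance window (table and
index names are hypothetical):

CLUSTER my_slow_query_idx ON my_big_table;  -- pre-8.3 CLUSTER syntax
ANALYZE my_big_table;                       -- refresh planner statistics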

Thanks Bill and Gregory for the advice.
Regards,
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Big delete on big table... now what?

2008-08-22 Thread Fernando Hevia
Hi list.
 
I have a table with over 30 million rows. Performance was dropping steadily
so I moved old data not needed online to an historic table. Now the table
has about 14 million rows. I don't need the disk space returned to the OS
but I do need to improve performance. Will a plain vacuum do or is a vacuum
full necessary?
Would a vacuum full improve performance at all?
 
Thanks for your insight.
Regards,
 
Fernando.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Distant mirroring

2008-08-11 Thread Fernando Hevia

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of dforums
 Sent: Monday, August 11, 2008 11:27
 To: Scott Marlowe; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Distant mirroring
 
 Houlala
 
 I've got a headache!!!
 
 So please help...
 
 Assuming they all happen from 9 to 5 and during business 
 days only, that's about 86 transactions per second.  Well 
 within the realm of a single mirror set to keep up if you 
 don't make your db work real fat.
 
 OK, I like it. But my reality is that an insert into a 
 table that has 27 million entries takes 200 ms, 
 so it takes between 2 and 10 minutes to treat 3000 
 records and dispatch/aggregate them into other tables. And I have for 
 now 2 records every 3 minutes.
 

You must try to partition that table. It should considerably speed up your
inserts.
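
On 8.x that means inheritance-based partitioning (declarative PARTITION BY
arrived much later). A minimal sketch with hypothetical names:

CREATE TABLE entries_2008_08 (
    CHECK (created >= DATE '2008-08-01' AND created < DATE '2008-09-01')
) INHERITS (entries);
-- plus a trigger or rule on entries routing inserts to the right child,
-- and constraint_exclusion = on so the planner skips the other children.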

 
 So I need a solution to be able to, first, support more 
 transactions; secondly, to secure the data; and thirdly, to be 
 able to load-balance the workload.
 
 Please, give me any advice or suggestion that can help me.

Have you taken into consideration programming a solution on BerkeleyDB? It's
an API that provides a high-performance non-SQL database. With such a
solution you could achieve several thousand tps on much smaller hardware.
You could use non-work hours to dump your data to Postgres for SQL support,
for reporting and such.

Regards,
Fernando


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia
Hi list,

I am building kind of a poor man's database server: 

Pentium D 945 (2 x 3 Ghz cores)
4 GB RAM
4 x 160 GB SATA II 7200 rpm (Intel server motherboard has only 4 SATA ports)

Database will be about 30 GB in size initially and growing 10 GB per year.
Data is inserted overnight in two big tables and during the day mostly
read-only queries are run. Parallelism is rare.

I have read about different RAID levels with Postgres but the advice found
seems to apply to systems with 8+ disks. With only four disks and
performance in mind, should I build a RAID 10 or RAID 5 array? RAID 0 is
ruled out since redundancy is needed.
I am going to use software RAID with Linux (Ubuntu Server 6.06). 

Thanks for any insight.
Regards,
Fernando.


Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia
Mark Mielke Wrote:

In my experience, software RAID 5 is horrible. Write performance can
decrease below the speed of one disk on its own, and read performance will
not be significantly more than RAID 1+0 as the number of stripes has only
increased from 2 to 3, and if reading while writing, you will not get 3X as
RAID 5 write requires at least two disks to be involved. I believe hardware
RAID 5 is also horrible, but since the hardware hides it from the
application, a hardware RAID 5 user might not care.

Software RAID 1+0 works fine on Linux with 4 disks. This is the setup I use
for my personal server.

I will use software RAID so RAID 1+0 seems to be the obvious choice.
Thanks for the advice!



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia

 Bill Moran wrote:
 
 RAID 10.
 
 I snipped the rest of your message because none of it matters.  Never use
 RAID 5 on a database system.  Ever.  There is absolutely NO reason to
 ever put yourself through that much suffering.  If you hate yourself
 that much just commit suicide, it's less drastic.
 

Well, that's a pretty strong argument. No suicide in my plans, gonna stick
to RAID 10. :)
Thanks.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia


 David Lang Wrote:
 
 with only four drives the space difference between raid 1+0 and raid 5
 isn't that much, but when you do a write you must write to two drives (the
 drive holding the data you are changing, and the drive that holds the
 parity data for that stripe, possibly needing to read the old parity data
 first, resulting in stalling for seek/read/calculate/seek/write since
 the drive moves on after the read), when you read you must read _all_
 drives in the set to check the data integrity.

Thanks for the explanation David. It's good to know not only what but also
why. Still, I wonder why reads hit all the drives. Shouldn't only 2 disks be
read: the one with the data and the parity disk?

 
 for seek heavy workloads (which almost every database application is) the
 extra seeks involved can be murder on your performance. if your workload
 is large sequential reads/writes, and you can let the OS buffer things for
 you, the performance of raid 5 is much better.

Well, actually most of my application involves large sequential
reads/writes. The memory available for buffering (4GB) isn't bad either, at
least for my scenario. On the other hand I have seen such strong posts
against RAID 5 that I hesitate to even consider it.

 
 Linux software raid can do more then two disks in a mirror, so you may be
 able to get the added protection with raid 1 sets (again, probably not
 relavent to four drives), although there were bugs in this within the last
 six months or so, so you need to be sure your kernel is new enough to have
 the fix.
 

Well, here arises another doubt: should I go for a single RAID 1+0 storing
OS + data + WAL files, or will I be better off with two RAID 1 arrays,
separating data from OS + WAL files?

 now, if you can afford solid-state drives which don't have noticable seek
 times, things are completely different ;-)

Ha, sadly budget is very tight. :)

Regards,
Fernando.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings