Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-11-04 Thread jmcdonagh
Thanks for the confirmation Jerry. 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5825615.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-11-04 Thread jmcdonagh
Tomas Vondra wrote
 On 29.10.2014 16:12, jmcdonagh wrote:
 Hi Tomas- thank you for your thoughtful response!
 
 
 Tomas Vondra wrote
 On 28.10.2014 21:55, jmcdonagh wrote:
 Hi, we have a nightly job that restores current production data to
 the development databases in a 'warm spare' database so that if the
 developers need fresh data, it's ready during the day. When we moved
 from 9.0 to 9.2 suddenly the restores began to take from a few hours
 to more like 15 hours or so. We're in Amazon EC2, I've tried new EBS
 volumes, warmed them up, threw IOPS at them, pretty much all the
 standard stuff to get more disk performance.

 So, if I understand it correctly, you've been restoring into 9.0, then
 you switched to 9.2 and it's much slower?
 
 Yes- but since the move was done utilizing snapshots so the move
 involves new volumes, but I have created new volumes since then to
 rule out a single bad volume.
 
 My advice would be to do some basic low-level performance tests to rule
 this out. Use dd or (better) fio to test basic I/O performance, it's
 much easier to spot issues that way.

I've done dd tests and the volumes perform fine. 


Tomas Vondra wrote
 Tomas Vondra wrote
 Is the 9.2 configured equally to 9.0? If you do something like this

   SELECT name, setting
 FROM pg_settings
WHERE source = 'configuration file';

 on both versions, what do you get?
 
 I no longer have the 9.0 box up but we do track configuration via
 puppet and git. The only configuration change made for 9.2 is:
 
 -#standard_conforming_strings = off
 +standard_conforming_strings = off
 
 Compared to 9.0, I suppose? Anyway, post the non-default config values
 at least for 9.2, please.

Yea, so in comparison to the only change was that. Here are the non-default
settings (some of them are probably defaults, but these are the uncommented
lines from postgresql.conf):

data_directory = '/mnt/postgresql/9.2/main'  # use data in another
directory
hba_file = '/etc/postgresql/9.2/main/pg_hba.conf'   # host-based
authentication file
ident_file = '/etc/postgresql/9.2/main/pg_ident.conf'   # ident
configuration file
external_pid_file = '/var/run/postgresql/9.2-main.pid'  # write an extra
PID file
listen_addresses = '*'# what IP address(es) to listen on;
port = 5432# (change requires restart)
max_connections = 300 # (change requires restart)
unix_socket_directory = '/var/run/postgresql'  # (change requires
restart)
ssl = true# (change requires restart)
shared_buffers = 4GB # min 128kB
temp_buffers = 128MB # min 800kB
work_mem = 256MB# min 64kB
maintenance_work_mem = 256MB  # min 1MB
wal_buffers = 512kB # min 32kB
commit_delay = 5 # range 0-10, in microseconds
commit_siblings = 1 # range 1-1000
random_page_cost = 2.0 # same scale as above
effective_cache_size = 16GB
from_collapse_limit = 10
join_collapse_limit = 10  # 1 disables collapsing of explicit
log_destination = 'stderr'  # Valid values are combinations of
client_min_messages = warning  # values in order of decreasing detail:
log_min_messages = warning  # values in order of decreasing detail:
log_min_duration_statement = 1000   # -1 is disabled, 0 logs all statements
log_line_prefix = '%t ' # special values:
autovacuum = on # Enable autovacuum subprocess?  'on'
datestyle = 'iso, mdy'
timezone = EST5EDT # actually, defaults to TZ environment
client_encoding = sql_ascii  # actually, defaults to database
lc_messages = 'en_US.UTF-8' # locale for system error message
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8' # locale for number formatting
lc_time = 'en_US.UTF-8'# locale for time formatting
default_text_search_config = 'pg_catalog.english'
standard_conforming_strings = off


Tomas Vondra wrote
 Tomas Vondra wrote
 Here's the thing, the disk isn't saturated. The behavior I'm seeing 
 seems very odd to me; I'm seeing the source disk which holds the dump
 saturated by reads, which is great, but then I just see nothing being
 written to the postgres volume. Just nothing happening, then a
 small burst. There is no write queue backup on the destination disk
 either. if I look at pg_stat_activity I'll see something like:

 COPY salesforce_reconciliation (salesforce_id, email,
 advisor_salesforce_id, processed) FROM stdin

 and even for small tables, that seems to take a very long time even
 though the destination disk is almost at 0 utilization.

 So, where's the bottleneck? Clearly, there's one, so is it a CPU, a
 disk or something else? Or maybe network, because you're using EBS?

 What do you mean by 'utilization'? How do you measure that?
 
 The bottleneck is I/O somehow. I say somehow, because I see iowait 
 averaging about 50% between two CPUs, but there is just no writes to 
 the destination EBS volume really happening, just reads from the
 disk where 

Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-30 Thread jmcdonagh
I just had a thought- I know some of these tables are in need of a vacuuming.
Could it be that the dump is dumping a bunch of garbage that the restore has
to sift through on the restore? I don't know enough details to know if this
is a dumb thought or not.

The restore to RDS took roughly the same amount of time. My next move is to
try on a fast instance store, and also do a postgres 9 restore of a pure SQL
dump, but that won't really be a great test since I use custom format. I'm
assuming here that I can't take the custom dump from 9.2 and apply it to
9.0, or can I?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5825052.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-30 Thread Jerry Sievers
jmcdonagh joseph.e.mcdon...@gmail.com writes:

 I just had a thought- I know some of these tables are in need of a vacuuming.
 Could it be that the dump is dumping a bunch of garbage that the restore has
 to sift through on the restore? I don't know enough details to know if this
 is a dumb thought or not.

No.  However it's true that the dump will take a bit longer having to
scan a bloated table rather than a tight one.

Dump will only output the live rows.  psql or pg_restore whatever you're
using on the target side will not have to step over any junk.

HTH


 The restore to RDS took roughly the same amount of time. My next move is to
 try on a fast instance store, and also do a postgres 9 restore of a pure SQL
 dump, but that won't really be a great test since I use custom format. I'm
 assuming here that I can't take the custom dump from 9.2 and apply it to
 9.0, or can I?



 --
 View this message in context: 
 http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5825052.html
 Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-30 Thread Tomas Vondra
On 29.10.2014 16:12, jmcdonagh wrote:
 Hi Tomas- thank you for your thoughtful response!
 
 
 Tomas Vondra wrote
 On 28.10.2014 21:55, jmcdonagh wrote:
 Hi, we have a nightly job that restores current production data to
 the development databases in a 'warm spare' database so that if the
 developers need fresh data, it's ready during the day. When we moved
 from 9.0 to 9.2 suddenly the restores began to take from a few hours
 to more like 15 hours or so. We're in Amazon EC2, I've tried new EBS
 volumes, warmed them up, threw IOPS at them, pretty much all the
 standard stuff to get more disk performance.

 So, if I understand it correctly, you've been restoring into 9.0, then
 you switched to 9.2 and it's much slower?
 
 Yes- but since the move was done utilizing snapshots so the move
 involves new volumes, but I have created new volumes since then to
 rule out a single bad volume.

My advice would be to do some basic low-level performance tests to rule
this out. Use dd or (better) fio to test basic I/O performance, it's
much easier to spot issues that way.

 Tomas Vondra wrote
 Is the 9.2 configured equally to 9.0? If you do something like this

   SELECT name, setting
 FROM pg_settings
WHERE source = 'configuration file';

 on both versions, what do you get?
 
 I no longer have the 9.0 box up but we do track configuration via
 puppet and git. The only configuration change made for 9.2 is:
 
 -#standard_conforming_strings = off
 +standard_conforming_strings = off

Compared to 9.0, I suppose? Anyway, post the non-default config values
at least for 9.2, please.

 Cause we have an old app that needs this setting on otherwise we'd
 spend a lot of time trying to fix it.

I doubt standard_conforming_strings has anything to do with the issues.

 Tomas Vondra wrote
 Here's the thing, the disk isn't saturated. The behavior I'm seeing 
 seems very odd to me; I'm seeing the source disk which holds the dump
 saturated by reads, which is great, but then I just see nothing being
 written to the postgres volume. Just nothing happening, then a
 small burst. There is no write queue backup on the destination disk
 either. if I look at pg_stat_activity I'll see something like:

 COPY salesforce_reconciliation (salesforce_id, email,
 advisor_salesforce_id, processed) FROM stdin

 and even for small tables, that seems to take a very long time even
 though the destination disk is almost at 0 utilization.

 So, where's the bottleneck? Clearly, there's one, so is it a CPU, a
 disk or something else? Or maybe network, because you're using EBS?

 What do you mean by 'utilization'? How do you measure that?
 
 The bottleneck is I/O somehow. I say somehow, because I see iowait 
 averaging about 50% between two CPUs, but there is just no writes to 
 the destination EBS volume really happening, just reads from the
 disk where the source dump is located, then bursts of writes to the
 destination volume every so often. It's kind of puzzling. This is
 happening on multiple database servers, in multiple availability
 zones. Driving me bonkers.
 
 What I mean by utilization is util% from iostat -m -x 1.

I find this rather contradictory. At one moment you say the disk isn't
saturated, the next moment you say you're I/O bound.

Also, iowait (as reported e.g. by 'top') is tricky to interpret
correctly, especially on multi-cpu systems (nice intro to the complexity
[1]). It's really difficult to interpret the 50% iowait without more
info about what's happening on the machine.

IMHO, the utilization (as reported by iotop) is much easier to
interpret, because it means '% of time the device was servicing
requests'. It has issues too, because 100% does not mean 'saturated'
(especially on RAID arrays that can service multiple requests in
parallel), but it's better than iowait.

If I had to guess based from your info, I'd bet you're CPU bound, so
there's very little idle time and about 50% of it is spent waiting for
I/O requests (hence the 50% iowait). But in total the amount of I/O is
very small, so %util is ~0.

Please, post a few lines of 'iostat -x -k 1' output. Samples from 'top'
and 'vmstat 1' would be handy too.

regards
Tomas

[1] http://veithen.blogspot.cz/2013/11/iowait-linux.html



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-29 Thread jmcdonagh
Hi Tomas- thank you for your thoughtful response!


Tomas Vondra wrote
 On 28.10.2014 21:55, jmcdonagh wrote:
 Hi, we have a nightly job that restores current production data to
 the development databases in a 'warm spare' database so that if the
 developers need fresh data, it's ready during the day. When we moved
 from 9.0 to 9.2 suddenly the restores began to take from a few hours
 to more like 15 hours or so. We're in Amazon EC2, I've tried new EBS
 volumes, warmed them up, threw IOPS at them, pretty much all the
 standard stuff to get more disk performance.
 
 So, if I understand it correctly, you've been restoring into 9.0, then
 you switched to 9.2 and it's much slower?

Yes- but since the move was done utilizing snapshots so the move involves
new volumes, but I have created new volumes since then to rule out a single
bad volume.


Tomas Vondra wrote
 Is the 9.2 configured equally to 9.0? If you do something like this
 
   SELECT name, setting
 FROM pg_settings
WHERE source = 'configuration file';
 
 on both versions, what do you get?

I no longer have the 9.0 box up but we do track configuration via puppet and
git. The only configuration change made for 9.2 is:

-#standard_conforming_strings = off
+standard_conforming_strings = off

Cause we have an old app that needs this setting on otherwise we'd spend a
lot of time trying to fix it.


Tomas Vondra wrote
 Here's the thing, the disk isn't saturated. The behavior I'm seeing 
 seems very odd to me; I'm seeing the source disk which holds the dump
 saturated by reads, which is great, but then I just see nothing being
 written to the postgres volume. Just nothing happening, then a
 small burst. There is no write queue backup on the destination disk
 either. if I look at pg_stat_activity I'll see something like:
 
 COPY salesforce_reconciliation (salesforce_id, email,
 advisor_salesforce_id, processed) FROM stdin
 
 and even for small tables, that seems to take a very long time even
 though the destination disk is almost at 0 utilization.
 
 So, where's the bottleneck? Clearly, there's one, so is it a CPU, a disk
 or something else? Or maybe network, because you're using EBS?
 
 What do you mean by 'utilization'? How do you measure that?

The bottleneck is I/O somehow. I say somehow, because I see iowait averaging
about 50% between two CPUs, but there is just no writes to the destination
EBS volume really happening, just reads from the disk where the source dump
is located, then bursts of writes to the destination volume every so often.
It's kind of puzzling. This is happening on multiple database servers, in
multiple availability zones. Driving me bonkers.

What I mean by utilization is util% from iostat -m -x 1.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5824847.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-29 Thread Mathis, Jason
Is the instance ebs-optimized? I am wondering if its a configuration on the
instance not postgres or ebs.

On Wed, Oct 29, 2014 at 10:12 AM, jmcdonagh joseph.e.mcdon...@gmail.com
wrote:

 Hi Tomas- thank you for your thoughtful response!


 Tomas Vondra wrote
  On 28.10.2014 21:55, jmcdonagh wrote:
  Hi, we have a nightly job that restores current production data to
  the development databases in a 'warm spare' database so that if the
  developers need fresh data, it's ready during the day. When we moved
  from 9.0 to 9.2 suddenly the restores began to take from a few hours
  to more like 15 hours or so. We're in Amazon EC2, I've tried new EBS
  volumes, warmed them up, threw IOPS at them, pretty much all the
  standard stuff to get more disk performance.
 
  So, if I understand it correctly, you've been restoring into 9.0, then
  you switched to 9.2 and it's much slower?

 Yes- but since the move was done utilizing snapshots so the move involves
 new volumes, but I have created new volumes since then to rule out a single
 bad volume.


 Tomas Vondra wrote
  Is the 9.2 configured equally to 9.0? If you do something like this
 
SELECT name, setting
  FROM pg_settings
 WHERE source = 'configuration file';
 
  on both versions, what do you get?

 I no longer have the 9.0 box up but we do track configuration via puppet
 and
 git. The only configuration change made for 9.2 is:

 -#standard_conforming_strings = off
 +standard_conforming_strings = off

 Cause we have an old app that needs this setting on otherwise we'd spend a
 lot of time trying to fix it.


 Tomas Vondra wrote
  Here's the thing, the disk isn't saturated. The behavior I'm seeing
  seems very odd to me; I'm seeing the source disk which holds the dump
  saturated by reads, which is great, but then I just see nothing being
  written to the postgres volume. Just nothing happening, then a
  small burst. There is no write queue backup on the destination disk
  either. if I look at pg_stat_activity I'll see something like:
 
  COPY salesforce_reconciliation (salesforce_id, email,
  advisor_salesforce_id, processed) FROM stdin
 
  and even for small tables, that seems to take a very long time even
  though the destination disk is almost at 0 utilization.
 
  So, where's the bottleneck? Clearly, there's one, so is it a CPU, a disk
  or something else? Or maybe network, because you're using EBS?
 
  What do you mean by 'utilization'? How do you measure that?

 The bottleneck is I/O somehow. I say somehow, because I see iowait
 averaging
 about 50% between two CPUs, but there is just no writes to the destination
 EBS volume really happening, just reads from the disk where the source dump
 is located, then bursts of writes to the destination volume every so often.
 It's kind of puzzling. This is happening on multiple database servers, in
 multiple availability zones. Driving me bonkers.

 What I mean by utilization is util% from iostat -m -x 1.





 --
 View this message in context:
 http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5824847.html
 Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-29 Thread jmcdonagh
Hi Jason, oddly enough the setting on or off does not affect this particular
issue. As a rule I generally enable this option on my instances that support
it. I recently tried upping the nodes to the latest generation (m3) to try
and rectify/improve this issue. Unfortunately right now m3 won't work
because we rely on a lot of space in mnt for temporary data work and the new
instances don't have much space there (though it is much faster). So I went
back to m1.large and left EBS optimized off. I'm not seeing any note-worthy
change in performance.

So I went and fired up an RDS postgres instance here. Eventually I want to
move to RDS anyways, but it's not a good short term solution to the right
now issue. Restore is running now, so I'll know within the next day or so if
this is much faster.

I'm puzzled by the change from 9.0 to 9.2 coinciding with this though.
Before the upgrade this job never had any issues. But I am 100% aware that
could be a red herring.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5824871.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-28 Thread jmcdonagh
Hi, we have a nightly job that restores current production data to the
development databases in a 'warm spare' database so that if the developers
need fresh data, it's ready during the day. When we moved from 9.0 to 9.2
suddenly the restores began to take from a few hours to more like 15 hours
or so. We're in Amazon EC2, I've tried new EBS volumes, warmed them up,
threw IOPS at them, pretty much all the standard stuff to get more disk
performance.

Here's the thing, the disk isn't saturated. The behavior I'm seeing seems
very odd to me; I'm seeing the source disk which holds the dump saturated by
reads, which is great, but then I just see nothing being written to the
postgres volume. Just nothing happening, then a small burst. There is no
write queue backup on the destination disk either. if I look at
pg_stat_activity I'll see something like:

COPY salesforce_reconciliation (salesforce_id, email, advisor_salesforce_id,
processed) FROM stdin

and even for small tables, that seems to take a very long time even though
the destination disk is almost at 0 utilization.

The dumps are created with pg_dump -Fc and restored with pg_restore -d db -j
2 -O -U postgres PostgreSQL-db.sql.

Is it possible that some default settings were changed from 9.0 to 9.2 that
would cause this kind of behavior? I'm stumped here. Thanks in advance for
any consideration here.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Incredibly slow restore times after 9.09.2 upgrade

2014-10-28 Thread Tomas Vondra
On 28.10.2014 21:55, jmcdonagh wrote:
 Hi, we have a nightly job that restores current production data to
 the development databases in a 'warm spare' database so that if the
 developers need fresh data, it's ready during the day. When we moved
 from 9.0 to 9.2 suddenly the restores began to take from a few hours
 to more like 15 hours or so. We're in Amazon EC2, I've tried new EBS
 volumes, warmed them up, threw IOPS at them, pretty much all the
 standard stuff to get more disk performance.

So, if I understand it correctly, you've been restoring into 9.0, then
you switched to 9.2 and it's much slower?

Is the 9.2 configured equally to 9.0? If you do something like this

  SELECT name, setting
FROM pg_settings
   WHERE source = 'configuration file';

on both versions, what do you get?

 Here's the thing, the disk isn't saturated. The behavior I'm seeing 
 seems very odd to me; I'm seeing the source disk which holds the dump
 saturated by reads, which is great, but then I just see nothing being
 written to the postgres volume. Just nothing happening, then a
 small burst. There is no write queue backup on the destination disk
 either. if I look at pg_stat_activity I'll see something like:
 
 COPY salesforce_reconciliation (salesforce_id, email,
 advisor_salesforce_id, processed) FROM stdin
 
 and even for small tables, that seems to take a very long time even
 though the destination disk is almost at 0 utilization.

So, where's the bottleneck? Clearly, there's one, so is it a CPU, a disk
or something else? Or maybe network, because you're using EBS?

What do you mean by 'utilization'? How do you measure that?


 The dumps are created with pg_dump -Fc and restored with pg_restore
 -d db -j 2 -O -U postgres PostgreSQL-db.sql.

OK

 Is it possible that some default settings were changed from 9.0 to 
 9.2 that would cause this kind of behavior? I'm stumped here. Thanks
 in advance for any consideration here.

I doubt that. There probably were some changes (after all, we're talking
about 2 major versions), but we generally don't change it in a way
that'd hurt performance.

regards
Tomas



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance