We moved our stuff out of AWS a little over a year ago because the
performance was crazy inconsistent and unpredictable.  I think they do a
lot of oversubscribing, so you get strange sawtooth performance patterns
depending on who else is sharing your infrastructure and what they are
doing at the time.

The same unit of work would take 20 minutes per run for several hours,
then 2 1/2 hours per run for a day, then drop back to 20 minutes, and
sometimes land anywhere in between for hours or days at a stretch.  I could
never tell the business when the processing would be done, which made it
hard for them to set expectations with customers, promise deliverables, or
manage the business.  Smaller nodes seemed to be worse than larger nodes,
though I only have theories as to why.  I never got good support from AWS
to help me figure out what was happening.

My first thought is to run the same test on different days of the week and
different times of day to see if the numbers change radically.  Maybe spin
up a node in another data center and availability zone and try the test
there too.
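
As a rough sketch of how to compare those repeated runs: assuming each
run appends its EXPLAIN ANALYZE output to a log file (test.txt in the
benchmark quoted below), something like the following would print the
spread between the fastest and slowest run.  The function name
summarize_times is made up for illustration, and it only understands the
"Execution time: N ms" lines that 9.6 prints.

```shell
# Hypothetical helper: summarize "Execution time: N ms" lines from one or
# more log files (or stdin).  Prints run count, min, max, and max/min ratio.
summarize_times() {
  awk '/Execution time:/ {
         t = $3 + 0                      # third field is the time in ms
         if (n == 0 || t < min) min = t
         if (n == 0 || t > max) max = t
         n++
       }
       END {
         if (n == 0) { print "no samples"; exit 1 }
         ratio = (min > 0) ? max / min : 0
         printf "runs=%d min_ms=%.3f max_ms=%.3f ratio=%.2f\n", n, min, max, ratio
       }' "$@"
}
```

Running "summarize_times monday.txt friday.txt" (file names are only
examples) on logs from the same setting at different times would show
whether the numbers swing the way ours did.  Mixing different
effective_io_concurrency settings into one summary would not tell you
much, so keep one setting per comparison.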

My real suggestion is to move to Google Cloud, Rackspace, DigitalOcean, or
anywhere other than AWS.  (We moved to Google Cloud and have been very
happy there.  The performance is much more consistent, the management UI is
more intuitive, and the cost for equivalent infrastructure is lower too.)

On Wed, Jan 31, 2018 at 7:03 AM, Vitaliy Garnashevich <
vgarnashev...@gmail.com> wrote:

> Hi,
> I've tried to run a benchmark, similar to this one:
> https://www.postgresql.org/message-id/flat/CAHyXU0yiVvfQAnR9cyH%3DHWh1WbLRsioe%3DmzRJTHwtr%3D2azsTdQ%40mail.gmail.com#CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdq...@mail.gmail.com
> CREATE TABLESPACE test OWNER postgres LOCATION '/path/to/ebs';
> pgbench -i -s 1000 --tablespace=test pgbench
> echo "" >test.txt
> for i in 0 1 2 4 8 16 32 64 128 256 ; do
>   sync; echo 3 > /proc/sys/vm/drop_caches; service postgresql restart
>   echo "effective_io_concurrency=$i" >>test.txt
>   psql pgbench -c "set effective_io_concurrency=$i; set enable_indexscan=off; explain (analyze, buffers) select * from pgbench_accounts where aid between 1000 and 10000000 and abalance != 0;" >>test.txt
> done
> I get the following results:
> effective_io_concurrency=0
>  Execution time: 40262.781 ms
> effective_io_concurrency=1
>  Execution time: 98125.987 ms
> effective_io_concurrency=2
>  Execution time: 55343.776 ms
> effective_io_concurrency=4
>  Execution time: 52505.638 ms
> effective_io_concurrency=8
>  Execution time: 54954.024 ms
> effective_io_concurrency=16
>  Execution time: 54346.455 ms
> effective_io_concurrency=32
>  Execution time: 55196.626 ms
> effective_io_concurrency=64
>  Execution time: 55057.956 ms
> effective_io_concurrency=128
>  Execution time: 54963.510 ms
> effective_io_concurrency=256
>  Execution time: 54339.258 ms
> The test was using 100 GB gp2 SSD EBS. More detailed query plans are
> attached.
> PostgreSQL 9.6.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit
> The results look really confusing to me in two ways. The first one is that
> I've seen recommendations to set effective_io_concurrency=256 (or more) on
> EBS. The other one is that effective_io_concurrency=1 (the worst case) is
> actually the default for PostgreSQL on Linux.
> Thoughts?
> Regards,
> Vitaliy
