I've tried to re-run the test for some specific values of
effective_io_concurrency. The results were the same.
That's why I don't think the order of tests or variability in "hardware"
performance affected the results.
On 31/01/2018 15:01, Rick Otten wrote:
We moved our stuff out of AWS a little over a year ago because the
performance was crazy inconsistent and unpredictable. I think they do
a lot of oversubscribing so you get strange sawtooth performance
patterns depending on who else is sharing your infrastructure and what
they are doing at the time.
The same unit of work would take 20 minutes each for several hours,
and then take 2 1/2 hours each for a day, and then back to 20 minutes,
and sometimes anywhere in between for hours or days at a stretch. I
could never tell the business when the processing would be done, which
made it hard for them to set expectations with customers, promise
deliverables, or manage the business. Smaller nodes seemed to be
worse than larger nodes; I only have theories as to why. I never got
good support from AWS to help me figure out what was happening.
My first thought is to run the same test on different days of the week
and different times of day to see if the numbers change radically.
Maybe spin up a node in another data center and availability zone and
try the test there too.
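One rough way to put a number on "change radically" is to collect the timing of the same unit of work from each run and compare the slowest with the fastest. The sketch below is only an illustration: `spread` is a hypothetical helper name, and it assumes one timing per line on stdin, all in the same unit.

```shell
# spread: hypothetical helper (not part of any tool mentioned in this thread).
# Reads one timing per line on stdin and prints the minimum, the maximum,
# and the max/min ratio.
spread() {
  awk '
    NR == 1     { min = max = $1 + 0 }
    $1 + 0 < min { min = $1 + 0 }
    $1 + 0 > max { max = $1 + 0 }
    END { if (NR > 0) printf "min=%g max=%g ratio=%.2f\n", min, max, max / min }
  '
}

# The 20-minute and 2 1/2-hour (150-minute) runs described above:
printf '%s\n' 20 150 20 | spread
```

For those figures the ratio comes out at 7.50, which is far larger than ordinary run-to-run noise.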
My real suggestion is to move to Google Cloud or Rackspace or Digital
Ocean or somewhere other than AWS. (We moved to Google Cloud and
have been very happy there. The performance is much more consistent,
the management UI is more intuitive, AND the cost for equivalent
infrastructure is lower too.)
On Wed, Jan 31, 2018 at 7:03 AM, Vitaliy Garnashevich
<vgarnashev...@gmail.com> wrote:
I've tried to run a benchmark, similar to this one:
CREATE TABLESPACE test OWNER postgres LOCATION '/path/to/ebs';
pgbench -i -s 1000 --tablespace=test pgbench
echo "" >test.txt
for i in 0 1 2 4 8 16 32 64 128 256 ; do
    # Flush the OS page cache and restart PostgreSQL so every run starts cold.
    sync; echo 3 > /proc/sys/vm/drop_caches; service postgresql restart
    echo "effective_io_concurrency=$i" >>test.txt
    psql pgbench -c "set effective_io_concurrency=$i; set
        enable_indexscan=off; explain (analyze, buffers) select * from
        pgbench_accounts where aid between 1000 and 10000000 and abalance
        != 0;" >>test.txt
done
I get the following results:
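effective_io_concurrency=0
 Execution time: 40262.781 ms
effective_io_concurrency=1
 Execution time: 98125.987 ms
effective_io_concurrency=2
 Execution time: 55343.776 ms
effective_io_concurrency=4
 Execution time: 52505.638 ms
effective_io_concurrency=8
 Execution time: 54954.024 ms
effective_io_concurrency=16
 Execution time: 54346.455 ms
effective_io_concurrency=32
 Execution time: 55196.626 ms
effective_io_concurrency=64
 Execution time: 55057.956 ms
effective_io_concurrency=128
 Execution time: 54963.510 ms
effective_io_concurrency=256
 Execution time: 54339.258 ms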
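To compare the runs at a glance, the log can be reduced to one line per setting with the slowdown relative to the first run. This is a minimal sketch, not part of the original benchmark: `summarize` is a hypothetical helper name, and it assumes the log has the shape the loop above writes, that is an `effective_io_concurrency=N` line followed somewhere later by the `Execution time: ... ms` line of the EXPLAIN output.

```shell
# summarize: hypothetical helper; reads the benchmark log (test.txt by default)
# and prints "<setting> <execution ms> <ratio to the first run>" per run.
summarize() {
  awk '
    # Remember which setting the following EXPLAIN output belongs to.
    /^effective_io_concurrency=/ { split($0, kv, "="); setting = kv[2] }
    /Execution time:/ {
      ms = $(NF - 1)                # the number just before the "ms" unit
      if (base == "") base = ms     # the first run in the log is the baseline
      printf "%s %.3f %.2f\n", setting, ms, ms / base
    }
  ' "${1:-test.txt}"
}
```

Running `summarize test.txt` on the results above would give `1 98125.987 2.44` for the second run, i.e. roughly 2.4 times slower than the first run in the loop.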
The test was using 100 GB gp2 SSD EBS. More detailed query plans
PostgreSQL 9.6.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit
The results look really confusing to me in two ways. The first one
is that I've seen recommendations to set
effective_io_concurrency=256 (or more) on EBS. The other one is
that effective_io_concurrency=1 (the worst case) is actually the
default for PostgreSQL on Linux.