On Friday, April 12, 2013 6:20:09 PM UTC-7, Daniel Farina wrote:
>
> On Fri, Apr 12, 2013 at 1:17 PM, Jason Joseph <[email protected]> wrote:
> > We've been doing some hefty performance testing of the Heroku Postgres dbs
> > with the help of some of the people on the data team (thank you guys!) and
> > we just wanted to share some rough performance numbers that hopefully might
> > help you when you're selecting a db type next time.
>
> Thank you for doing this and compiling the results.
>
> > Setup: We have some large ETL operations that obviously are very write heavy,
> > and we wanted to get an idea of which db tier was best in terms of
> > price/performance. We're using the COPY command to move 2 million rows of
> > data from one db (a zilla) to another, and we wanted to get an idea of what
> > the performance differences were between the tiers. The Zilla that is being
> > copied from doesn't change between tests, and for each new test we spin up a
> > new db to make sure we are testing from a clean slate.
> >
> > DB Type - Time to copy (mm:ss)
> >
> > Ronin - 35:04
> > Fugu  - 26:43
> > Ika   -  8:34
> > Zilla -  7:22
> >
> > We didn't test a mecha (yet) or anything below Ronin because the COPY
> > literally wouldn't complete on anything lower than that.
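As a rough illustration of this kind of COPY-based transfer, a minimal sketch is a psql-to-psql pipe like the one below. The table name and connection URLs are placeholders rather than the exact setup used in these tests; pstdout/pstdin tell \copy to use psql's own stdout/stdin when the command is passed with -c.

    # placeholders: big_table, $SOURCE_DATABASE_URL, $TARGET_DATABASE_URL
    # stream the table out of the source db and straight into the target db
    psql "$SOURCE_DATABASE_URL" -c "\copy big_table to pstdout" \
      | psql "$TARGET_DATABASE_URL" -c "\copy big_table from pstdin"

Piping the COPY stream directly between the two psql processes avoids staging the data on disk in between.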
> Wouldn't complete? That strikes me as surprising.

We only tested once or twice, and it kept running well past an hour, so we gave up after a while. We could test again; maybe it was just very slow.

> Also, what was the aggregate size of the loaded data? Figures
> involving number of records processed can vary rather widely depending
> on data types (some have more expensive parsing routines) and width of
> the row (the aggregate size lets one move backwards to average width).
>
> Normally, parsing routine overhead is not a big deal as compared to
> I/O bottlenecks, but there are some data types where this effect is
> probably measurable.

The data was ~200 columns, mainly strings and integers with a few dates as well. The rough size of the 2 million rows was ~4 GB.

> > One thing we did realize with the help of the Heroku PG team is that there
> > is some serious variance in performance on newly provisioned dbs. They are
> > currently hypothesizing that it is related to the nature of the dynamic disk
> > space provisioning that happens as large amounts of data are inserted into a
> > freshly provisioned db. We saw in many instances close to 50% slower speeds.
> > Sometimes it would run normally (those are the values we listed above), but
> > for example on the Ika runs we would sometimes see it take close to ~15
> > minutes to complete the copy. After some initial hiccups, though, we believe
> > all dbs move towards their normal performance numbers listed above, so just
> > keep that in mind when you are provisioning a new db.
>
> Yeah, there are perhaps two causes for this. One is that a freshly
> awakened storage volume may be in some varying state of live-ness;
> one of our staff did experiments not that long ago suggesting the
> performance stabilizes fairly rapidly, but in the first few minutes
> after creation I imagine one could be extra-sensitive to those
> variations. There are the EBS-PIOPS-variety volumes, which seem to
> have been otherwise a good improvement all-around for most people.
>
> A fix there would be to pre-warm the volumes. Not precisely rocket
> science, but doing so in a way that would play nicely with dynamic
> disk addition would decrease flexibility in tweaking the sizes of
> volumes to add (because the pool would have a fixed inventory
> available, as such disks would take time to test), and it is probably a
> finicky piece of work, given our experience with using some of the
> APIs involved and the way they can act strangely at the margins.
>
> It could also be something else, of course. But these are the
> theories in play as of the moment. It's interesting that the blunt
> impact can make an 8 minute difference on 16 minutes running: the
> absolute size of the effect is larger than I would have guessed.

We were very surprised as well, which is why we reached out initially regarding this.

> A way to partially eliminate some of these problems in the experiment
> is to truncate away the loaded data and try again, and see whether the
> standard deviation takes a dive. It would also be interesting to see
> whether the low-performing runs tend to cluster towards the first such
> run, and whether the convergence tends to occur in one direction, e.g.
> low performers become high performers, or vice versa.

We realized after the tests that deleting rows didn't fix the performance issues, but truncating definitely helped immensely.
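A rough way to run the follow-up experiment Daniel describes (same placeholder names as in the earlier sketch) is a small bash loop that truncates and reloads several times, timing each pass:

    # bash sketch; big_table and the *_DATABASE_URL values are placeholders
    for run in 1 2 3 4 5; do
      # TRUNCATE throws away the table's storage outright; DELETE would leave
      # dead rows behind for vacuum, which is likely why deletes didn't help here
      psql "$TARGET_DATABASE_URL" -c "truncate big_table"
      time ( psql "$SOURCE_DATABASE_URL" -c "\copy big_table to pstdout" \
             | psql "$TARGET_DATABASE_URL" -c "\copy big_table from pstdin" )
    done

Comparing the spread of the run times, and whether the first run is the slow one, would show whether the variance really does settle down after the initial load.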
Unfortunately, we always did one run on a freshly provisioned db and never tried subsequent runs on the low- or high-performing instances, but our other experience leads me to think that even initially low-performing instances would trend towards the higher (normal) numbers with subsequent runs, as long as truncates were run between each run.

> There is still some room for other explanations, too: for example, if
> one's ETL issues a lot of queries whose processing is fast, then
> network round trip time could figure into this. I surmise that's not
> the case here, but it does cause a difference once in a while.

I assume this isn't the case with our tests, otherwise the performance would be fairly similar across the different db types, but I am not 100% sure on this. It definitely could be an issue with other tests, though.

> > Anyways, hope this was helpful and we would love to hear from anyone else
> > doing performance testing on the PG dbs.
>
> Me too.

Thank you again to the PG Data team for all of the help regarding these tests; it has been invaluable for us.
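On the round-trip-time point above: a crude way to see the effect is to compare loading the same rows with individual INSERT statements (one network round trip per statement) against a single \copy that streams them in one go. The file names below are hypothetical, purely for illustration:

    # hypothetical files: inserts_10000_rows.sql (one INSERT per row)
    # and rows_10000.csv (the same rows as CSV)
    # one round trip per row
    time psql "$TARGET_DATABASE_URL" -f inserts_10000_rows.sql
    # one streaming copy over a single connection
    time psql "$TARGET_DATABASE_URL" \
         -c "\copy big_table from 'rows_10000.csv' with (format csv)"

A bulk COPY streams everything over one connection, so per-statement latency largely drops out, which fits the assumption above that it wasn't the differentiator between tiers here.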
