On Fri, Apr 12, 2013 at 1:17 PM, Jason Joseph <[email protected]> wrote:
> We've been doing some hefty performance testing of the Heroku Postgres dbs
> with the help of some of the people on the data team (thank you guys!) and
> we just wanted to share some rough performance numbers that hopefully might
> help you when you're selecting a db type next time.
Thank you for doing this and compiling the results.

> Setup: We have some large ETL operations that obviously are very write heavy
> and we wanted to get an idea which db tier was best in terms of
> price/performance. We're using the COPY command to move 2 million rows of
> data from one db (a zilla) to another and we wanted to get an idea of what
> the performance differences were between the tiers. The Zilla that is being
> copied from doesn't change between tests and for each new test we spin up a
> new db to make sure we are testing from a clean slate.
>
> DB Type - Time to copy in minutes
>
> Ronin - 35:04
> Fugu - 26:43
> Ika - 8:34
> Zilla - 7:22
>
> We didn't test a mecha (yet) or anything below Ronin because the COPY
> literally wouldn't complete on anything lower than that.

Wouldn't complete? That strikes me as surprising.

Also, what was the aggregate size of the loaded data? Figures expressed as a
number of records processed can vary rather widely depending on the data types
involved (some have more expensive parsing routines) and the width of the row
(the aggregate size lets one work backwards to the average width). Normally,
parsing-routine overhead is not a big deal compared to I/O bottlenecks, but
there are some data types where the effect is probably measurable.

> One thing we did realize with the help of the Heroku PG team is that there
> is some serious variance in performance on newly provisioned dbs. They are
> currently hypothesizing that it is related to the nature of the dynamic disk
> space provisioning that happens as large amounts of data are inserted into a
> freshly provisioned db. We saw in many instances close to 50% slower speeds.
> Sometimes it would run normally (those are the values we listed above) but
> for example on the Ika runs, we would sometimes see it take close to ~15
> minutes to complete the copy. After some initial hiccups though we believe
> all dbs move towards their normal performance numbers listed above so just
> keep that in mind when you are provisioning a new db.

Yeah, there are perhaps two causes for this. One is that a freshly awakened
storage volume may be in some varying state of liveness: one of our staff ran
experiments not that long ago suggesting that performance stabilizes fairly
rapidly, but in the first few minutes after creation I imagine one could be
extra sensitive to those variations. There are the EBS PIOPS-variety volumes,
which otherwise seem to have been a good improvement all around for most
people. A fix there would be to pre-warm the volumes. That's not precisely
rocket science, but doing it in a way that plays nicely with dynamic disk
addition would decrease flexibility in tweaking the sizes of volumes to add
(because the pool would have a fixed inventory available, as such disks would
take time to test), and it is probably a finicky piece of work, given our
experience with some of the APIs involved and the way they can act strangely
at the margins.

It could also be something else, of course, but those are the theories in play
at the moment. It's interesting that the blunt impact can make an 8-minute
difference on roughly 16 minutes of running time: the absolute size of the
effect is larger than I would have guessed. One way to partially eliminate
some of these problems in the experiment is to truncate away the loaded data
and run the copy again, then see whether the standard deviation takes a dive.
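If you want to chase the variance specifically, something like the rough
sketch below is what I have in mind: reload the same data into the same target
several times, truncating in between, and look at the spread of the timings
(and, while you're at it, report the aggregate size and average row width).
This assumes psycopg2; the DSNs, the table name, and the run count are
placeholders for whatever your ETL actually uses, not your real setup.

# Sketch: repeat the COPY against the same target db, truncating between
# runs, and look at the spread of the load timings.
import statistics
import tempfile
import time

import psycopg2

SOURCE_DSN = "postgres://user:pass@source-host/dbname"  # hypothetical
TARGET_DSN = "postgres://user:pass@target-host/dbname"  # hypothetical
TABLE = "etl_rows"                                      # hypothetical
RUNS = 5

timings = []
with psycopg2.connect(SOURCE_DSN) as src, psycopg2.connect(TARGET_DSN) as dst:
    for run in range(RUNS):
        # Spool the source rows to a temp file so every run loads identical data.
        with tempfile.TemporaryFile() as spool:
            with src.cursor() as cur:
                cur.copy_expert("COPY %s TO STDOUT" % TABLE, spool)
            spool.seek(0)

            with dst.cursor() as cur:
                cur.execute("TRUNCATE %s" % TABLE)  # clean slate between runs
                start = time.time()
                cur.copy_expert("COPY %s FROM STDIN" % TABLE, spool)
                dst.commit()
                elapsed = time.time() - start
                timings.append(elapsed)

                # Aggregate size and row count -> average row width.
                cur.execute(
                    "SELECT pg_total_relation_size('{0}'), count(*) FROM {0}".format(TABLE)
                )
                size_bytes, rows = cur.fetchone()
        print("run %d: %.1fs, %d rows, %.1f bytes/row avg"
              % (run, elapsed, rows, float(size_bytes) / rows))

print("mean %.1fs, stdev %.1fs"
      % (statistics.mean(timings), statistics.stdev(timings)))

Spooling to a temp file first means every run loads byte-identical data, so
whatever spread remains should be mostly the volume rather than the source
side of the copy.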
It would also be interesting to see whether the low-performing runs tend to
cluster towards the first such run, and whether the convergence tends to occur
in one direction, e.g. low performers becoming high performers, or vice versa.

There is still some room for other explanations, too: for example, if one's
ETL issues a lot of queries whose processing is fast, then network round-trip
time could figure into this. I surmise that's not the case here, but it does
make a difference once in a while.

> Anyways, hope this was helpful and we would love to hear from anyone else
> doing performance testing on the PG dbs.

Me too.
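P.S. On the round-trip-time point above: a quick way to eyeball it from the
client side is to time a batch of trivial statements, since their server-side
processing cost is essentially nil. A rough sketch, again assuming psycopg2
and a placeholder DSN:

import time

import psycopg2

DSN = "postgres://user:pass@db-host/dbname"  # hypothetical

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        n = 200
        start = time.time()
        for _ in range(n):
            cur.execute("SELECT 1")  # near-zero processing, so mostly round trip
            cur.fetchone()
        per_query_ms = (time.time() - start) / n * 1000.0
        print("~%.2f ms per round trip" % per_query_ms)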
