Hi Samuel,
The SSDs were used as a cache for the spinning drives. Here is a 30-second
iostat sample representative of the whole run:
avg-cpu: %user %nice %system %iowait %steal %idle
24.87 0.00 12.54 62.39 0.00 0.20
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz
avgqu-sz await svctm %util
sdd 0.00 137.37 3058.40 106.17 34691.60 974.13 22.54
15.75 4.98 0.32 100.00
sde 0.00 136.07 3063.37 107.70 35267.07 975.07 22.86
15.58 4.92 0.32 100.00
sdf 0.00 135.37 3064.23 109.53 35815.60 979.60 23.19
15.82 4.99 0.32 100.00
sdg 0.00 136.97 3066.57 116.67 35196.53 1014.53 22.75
15.87 4.99 0.31 100.00
sdi 0.00 2011.03 0.00 87.90 0.00 8395.73 191.03
0.13 1.45 1.42 12.51
sdk 0.00 136.63 3066.83 107.53 35805.07 976.67 23.17
16.01 5.04 0.32 100.00
sdm 0.00 138.50 3054.40 111.10 34674.27 998.40 22.54
15.52 4.91 0.32 100.00
sdj 0.00 136.73 3058.70 118.20 35227.20 1019.73 22.82
15.81 4.98 0.31 100.00
sdl 0.00 137.53 3044.97 109.33 34448.00 987.47 22.47
15.78 5.00 0.32 100.00
The data and index tablespaces were striped across the 8 LUNs, and saw an
average 5ms response. We can beef up the storage to handle more I/Os so that
our utilization doesn't stay below 40%, but that misses the point: we have an
I/O rate twice the commercial database because they used clustered indexes.
I provided more config details in an earlier email.
As for asking for development to game a benchmark, no one is asking for
benchmark specials. The question of enhancements in response to benchmark needs
is an age old question. We can get into that, but it's really a different
discussion. Let me just expose the flip side of it: are we willing to watch
people use other databases to run benchmarks but feel content that no features
were developed specifically in response to benchmark results?
I am trying to engage with the community. We can drown the mailing list with
details. So I decided to open the discussion with the high level points, and we
will give you all the details that you want as we move forward.
Thanks,
Reza
From: Samuel Gendler [mailto:[email protected]]
Sent: Thursday, July 05, 2012 12:46 PM
To: Reza Taheri
Cc: Robert Klemme; [email protected]
Subject: Re: [PERFORM] The need for clustered indexes to boost TPC-V performance
On Thu, Jul 5, 2012 at 12:13 PM, Reza Taheri
<[email protected]<mailto:[email protected]>> wrote:
Hi Robert,
Yes, the same concept. Oracle's IOT feature is used often with TPC benchmarks.
Reza, it would be very helpful if you were to provide the list with a lot more
information about your current software and hardware configuration before
coming to the conclusion that the only possible way forward is with a
significant architectural change to the db engine itself. Not only is it not
at all clear that you are extracting maximum performance from your current
hardware and software, but I doubt anyone is particularly interested in doing a
bunch of development purely to game a benchmark. There has been significant
discussion of the necessity and viability of the feature you are requesting in
the past, so you should probably start where those discussions left off rather
than starting the discussion all over again from the beginning. Of course, if
vmware were to sponsor development of the feature in question, it probably
wouldn't require nearly as much buy-in from the wider community.
Getting back to the current performance issues - I have little doubt that the
MS SQL benchmark was set up and run by people who were intimately familiar with
MS SQL performance tuning. You stated in your earlier email that your team
doesn't have significant postgresql-specific experience, so it isn't
necessarily surprising that your first attempt at tuning didn't get the results
that you are looking for. You stated that you have 14 SSDs and 90 spinning
drives, but you don't specify how they are combined and how the database is
laid out on top of them. There is no mention of how much memory is available
to the system. We don't know how you've configured postgresql's memory
allocation or how your config weights the relative costs of index lookups,
sequential scans, etc. The guidelines for this mailing list include
instructions for what information should be provided when asking about
performance improvements. http://archives.postgresql.org/pgsql-performance/
Let's start by ascertaining how your benchmark results can be improved without
engaging in a significant development effort on the db engine itself.