On 3/18/12 10:25 AM, Tom Lane wrote:
> Jeff Janes <jeff.ja...@gmail.com> writes:
>> On Wed, Mar 7, 2012 at 11:55 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>> On Sat, Mar 3, 2012 at 4:15 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>>>> Anyway, I think the logtape could use redoing.
>> The problem there is that none of the files can be deleted until it
>> was entirely read, so you end up with all the data on disk twice. I
>> don't know how often people run their databases so close to the edge
>> on disk space that this matters, but someone felt that that extra
>> storage was worth avoiding.
> Yeah, that was me, and it came out of actual user complaints ten or more
> years back. (It's actually not 2X growth but more like 4X growth
> according to the comments in logtape.c, though I no longer remember the
> exact reasons why.) We knew when we put in the logtape logic that we
> were trading off speed for space, and we accepted that. It's possible
> that with the growth of hard drive sizes, real-world applications would
> no longer care that much about whether the space required to sort is 4X
> data size rather than 1X. Or then again, maybe their data has grown
> just as fast and they still care.

I believe tape sorts that fit entirely in filesystem cache are a big case as
well. Doubling (or worse) the amount of data that has to live "on disk" at
once would likely hurt there, since the sort might no longer fit in cache.

Also, it's not uncommon to be IO-bound on a database server, so even if we're
not worried about the disk space needed to store everything two or more
times, we should be concerned about the IO bandwidth.
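
To put rough numbers on the space side of this, here is a toy model in Python. It is only a sketch and not how logtape.c is written: the function name peak_blocks, the round-robin read order, and the one-block-out-per-block-in accounting are all assumptions made for illustration. It covers a single merge pass, so it reproduces the "data on disk twice" case quoted above and does not try to explain the 4X figure.

```python
def peak_blocks(run_lengths, recycle):
    """Peak number of disk blocks in use while merging runs into one output.

    recycle=False: an input run's blocks are freed only once the whole
                   run has been read (one file per run, deleted at the end).
    recycle=True:  each input block is freed as soon as it has been read,
                   so the output can reuse it.
    """
    remaining = list(run_lengths)   # unread blocks left in each input run
    in_use = sum(run_lengths)       # blocks currently occupied on disk
    peak = in_use
    while any(remaining):
        # Hypothetical access pattern: read the runs round-robin.
        for i, left in enumerate(remaining):
            if left == 0:
                continue
            remaining[i] -= 1                   # read one block from run i
            if recycle:
                in_use -= 1                     # block is free right away
            elif remaining[i] == 0:
                in_use -= run_lengths[i]        # whole file deleted at once
            in_use += 1                         # write one merged block
            peak = max(peak, in_use)
    return peak


runs = [100, 100, 100, 100]
print(peak_blocks(runs, recycle=False))  # 796, close to 2X the 400 input blocks
print(peak_blocks(runs, recycle=True))   # 400, i.e. 1X
```

With equal-length runs all the input files finish at about the same time, which is why the non-recycling case sits near 2X for almost the whole merge.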
--
Jim C. Nasby, Database Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net