On Thu, Jan 16, 2014 at 3:23 PM, Dave Chinner <da...@fromorbit.com> wrote:
> On Wed, Jan 15, 2014 at 06:14:18PM -0600, Jim Nasby wrote:
> > On 1/15/14, 12:00 AM, Claudio Freire wrote:
> > >My completely unproven theory is that swapping is overwhelmed by
> > >near-misses. Ie: a process touches a page, and before it's
> > >actually swapped in, another process touches it too, blocking on
> > >the other process' read. But the second process doesn't account
> > >for that page when evaluating predictive models (ie: read-ahead),
> > >so the next I/O by process 2 is unexpected to the kernel. Then
> > >the same with 1. Etc... In essence, swap, by a fluke of its
> > >implementation, fails utterly to predict the I/O pattern, and
> > >results in far sub-optimal reads.
> > >
> > >Explicit I/O is free from that effect, all read calls are
> > >accountable, and that makes a difference.
> > >
> > >Maybe, if the kernel could be fixed in that respect, you could
> > >consider mmap'd files as a suitable form of temporary storage.
> > >But that would depend on the success and availability of such a
> > >fix/patch.
> > Another option is to consider some of the more "radical" ideas in
> > this thread, but only for temporary data. Our write sequencing and
> > other needs are far less stringent for this stuff. -- Jim C.
> I suspect that a lot of the temporary data issues can be solved by
> using tmpfs for temporary files....
Temp files can collectively reach hundreds of gigs. So I would have to set
up two temporary tablespaces, one in tmpfs and one in regular storage, and
then remember to choose between them based on my estimate of how much temp
space is going to be used in each connection (and hope I don't mess up the
estimation and so either get errors, or render the server unresponsive).
So I just use regular storage, and pay the "insurance premium" of having
some extraneous write IO. It would be nice if the insurance premium were
cheaper, though. I think the IO storms during checkpoint syncs are
definitely the more critical issue; this is just something nice to have
which seemed to align with one of the comments.
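
To make the bookkeeping concrete, here is a minimal sketch in Python of
the per-connection choice described above. The tablespace names
(temp_tmpfs, temp_disk), the function name, and the 2x safety factor are
all hypothetical, picked only for illustration; temp_tablespaces is the
only real PostgreSQL setting used. The estimate passed in is exactly the
part that has to be guessed.

```python
def choose_temp_tablespace(estimated_bytes, tmpfs_free_bytes, safety_factor=2.0):
    """Build the SET statement that picks a temp tablespace for one session.

    Assumptions (not from the thread): two tablespaces already exist,
    'temp_tmpfs' on a tmpfs mount and 'temp_disk' on regular storage.
    """
    # Pad the estimate, since guessing too low on tmpfs means errors
    # or an unresponsive server once memory runs out.
    if estimated_bytes * safety_factor <= tmpfs_free_bytes:
        name = "temp_tmpfs"
    else:
        name = "temp_disk"
    return f"SET temp_tablespaces = '{name}';"


if __name__ == "__main__":
    gib = 2 ** 30
    # Small sort: fits comfortably in an 8 GiB tmpfs.
    print(choose_temp_tablespace(1 * gib, 8 * gib))
    # Hundreds of gigs of temp files: must go to regular storage.
    print(choose_temp_tablespace(200 * gib, 8 * gib))
```

Each connection would have to run the returned statement before its
query, which is the manual step that makes this approach fragile.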