Dave Chinner <da...@fromorbit.com> writes:
> On Wed, Jan 15, 2014 at 07:08:18PM -0500, Tom Lane wrote:
>> No, we'd be happy to re-request it during each checkpoint cycle, as
>> long as that wasn't an unduly expensive call to make. I'm not quite
>> sure where such requests ought to "live" though. One idea is to tie
>> them to file descriptors; but the data to be written might be spread
>> across more files than we really want to keep open at one time.
> It would be a property of the inode, as that is how writeback is
> tracked and timed. Set and queried through a file descriptor,
> though - it's basically the same context that fadvise works in.
Ah, got it. That would be fine on our end, I think.
>> We could probably live with serially checkpointing data
>> in sets of however-many-files-we-can-have-open, if file descriptors are
>> the place to keep the requests.
> Inodes live longer than file descriptors, but there's no guarantee
> that they live from one fd context to another. Hence my question
> about persistence ;)
I plead ignorance about what an "fd context" is. However, if what you're
saying is that there's a small chance of the kernel forgetting the request
during normal system operation, I think we could probably tolerate that,
if the API is designed so that we ultimately do an fsync on the file
anyway. The point of the hint would be to try to ensure that the later
fsync had little to do. If sometimes it didn't work, well, that's life.
We're ahead of the game as long as it usually works.
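
To make the shape of that concrete, here is a rough sketch of what I have in mind, in Python for brevity. None of this is a real kernel interface: request_writeback_hint() is a hypothetical stand-in for whatever call ends up existing, and checkpoint_files() and max_open are names made up for illustration. The only real calls are os.open, os.fsync and os.close. The point is just that the hint is best-effort and the fsync happens regardless.

```python
import os
from typing import Callable, Iterable


def request_writeback_hint(fd: int) -> bool:
    """Hypothetical placeholder for the proposed hint; no such call exists.

    Returning False models the kernel declining or forgetting the request.
    """
    return False


def checkpoint_files(
    paths: Iterable[str],
    max_open: int = 4,
    hint: Callable[[int], bool] = request_writeback_hint,
) -> int:
    """Checkpoint files in batches of at most max_open open descriptors.

    Returns the number of files that were fsync'd.
    """
    path_list = list(paths)
    synced = 0
    for start in range(0, len(path_list), max_open):
        fds = []
        try:
            for path in path_list[start:start + max_open]:
                fds.append(os.open(path, os.O_RDWR))
            # Best effort: the result is deliberately ignored.
            for fd in fds:
                hint(fd)
            # The fsync is what provides durability, hint or no hint.
            for fd in fds:
                os.fsync(fd)
                synced += 1
        finally:
            for fd in fds:
                os.close(fd)
    return synced
```

In real use there would be a delay between issuing the hints and the fsyncs, so that the kernel has a chance to get the writes done first; the sketch omits that.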
regards, tom lane
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)