On Tue, Mar 26, 2013 at 7:30 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 26 March 2013 11:33, Robert Haas <robertmh...@gmail.com> wrote:
>> On Tue, Mar 26, 2013 at 5:27 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
>>> On 26 March 2013 01:35, Greg Stark <st...@mit.edu> wrote:
>>>> On Tue, Mar 26, 2013 at 12:00 AM, Simon Riggs <si...@2ndquadrant.com> 
>>>> wrote:
>>>>> I'll bet you all a beer at PgCon 2014 that this remains unresolved at
>>>>> that point.
>>>>
>>>> Are you saying you're only interested in working on it now? That after
>>>> 9.3 is released you won't be interested in working on it any more?
>>>>
>>>> As you said we've been eyeing this particular logic since 2004, why
>>>> did it suddenly become more urgent now? Why didn't you work on it 9
>>>> months ago at the beginning of the release cycle?
>>>
>>> I'm not sure why your comments are so confrontational here, but I
>>> don't think it helps much. I'm happy to buy you a beer too.
>>>
>>> As I explained clearly in my first post, this idea came about while
>>> trying to improve on the negative aspects of the checksum patch.
>>> People were working on ideas 9 months ago to resolve this, but they
>>> have come to nothing. I regret that; Merlin and others have worked
>>> hard to find a way: respect to them.
>>>
>>> My suggestion is to implement a feature that takes 1 day to write and
>>> needs little testing to show it works.
>>
>> Any patch in this area isn't likely to take much testing to establish
>> whether it improves some particular case.  The problem is what happens
>> to all of the other cases - and I don't believe that part needs little
>> testing, hence the objections (with which I agree) to doing anything
>> about this now.
>>
>> If we want to change something in this area, we might consider
>> resurrecting the patch I worked on for this last year, which had, I
>> believe, a fairly similar mechanism of operation to what you're
>> proposing, and some other nice properties as well:
>>
>> http://www.postgresql.org/message-id/aanlktik5qzr8wts0mqcwwmnp-qhgrdky5av5aob7w...@mail.gmail.com
>> http://www.postgresql.org/message-id/aanlktimgkag7wdu-x77gnv2gh6_qo5ss1u5b6q1ms...@mail.gmail.com
>>
>> ...but I think the main reason why that never went anywhere is because
>> we never really had any confidence that the upsides were worth the
>> downsides.  Fundamentally, postponing hint bit setting (or hint bit
>> I/O) increases the total amount of work done by the system.  You still
>> end up writing the hint bits eventually, and in the meantime you do
>> more CLOG lookups.  Now, as a compensating benefit, you can spread the
>> work of writing the hint-bit updated pages out over a longer period of
>> time, so that no single query carries too much of the burden of
>> getting the bits set.  The worst-case-latency vs. aggregate-throughput
>> tradeoff is one with a long history and I think it's appropriate to
>> view this problem through that lens also.
>
> I hadn't realised so many patches existed that were similar. Hackers
> is bigger these days.
>
> Reviewing the patch, I'd say the problem is that it is basically
> implementing a new automatic heuristic. We simply don't have any
> evidence that any new heuristic will work for all cases, so we do
> nothing.
>
> Whether we apply my patch, yours or Merlin's, my main thought now is
> that we need a user parameter to control it so it can be adjusted
> according to need and not touched at all if there is no problem.

After a night thinking about this, I'd like to make some points:

*) my patch deliberately did not 'set bits without dirty' -- with
checksums in mind, as you noted (thanks for that).  I think the upside
of marking pages in that fashion is overrated anyway.

*) Any strategy that does not approximate hint bit behavior is, IMNSHO,
a non-starter.  By that I mean that when your $condition is met so that
hint bits are not being written out, scans need to bail out of
HeapTupleSatisfiesMVCC processing with a cheap check.  If you don't do
that and rely on the transam.c guard, you've already missed the boat:
even without the clog lookup, the extra processing there will, I can
assure you, show up in profiling of repeated scans (until vacuum).
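To make the "cheap check" concrete, here is a minimal sketch in C of the shape being argued for. Everything is hypothetical and simplified (the struct, the macro, and `demo_*` names are invented for illustration, not PostgreSQL's actual definitions): if the hint bit is already set, the visibility test returns before any transaction-status machinery runs, and the `allow_hint_writes` flag models the $condition that suppresses dirtying the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical, stripped-down tuple header: just the fields needed here. */
typedef struct
{
    TransactionId xmin;
    uint16_t      infomask;     /* visibility hint bits live here */
} DemoTupleHeader;

#define DEMO_XMIN_COMMITTED 0x0100

static int clog_lookups = 0;    /* counts trips into the "slow path" */

/* Stand-in for a transam.c/clog probe: always "committed" in this sketch. */
static bool
demo_xid_committed(TransactionId xid)
{
    (void) xid;
    clog_lookups++;
    return true;
}

/*
 * Sketch of the cheap bailout: a tuple whose hint bit is already set is
 * resolved with one branch and never reaches the expensive lookup.  When
 * allow_hint_writes is false we still answer correctly, but we do not
 * stamp the hint bit, so the page stays clean.
 */
static bool
demo_tuple_visible(DemoTupleHeader *tup, bool allow_hint_writes)
{
    if (tup->infomask & DEMO_XMIN_COMMITTED)
        return true;            /* cheap check: no clog lookup */

    if (!demo_xid_committed(tup->xmin))
        return false;

    if (allow_hint_writes)
        tup->infomask |= DEMO_XMIN_COMMITTED;
    return true;
}
```

The point of the sketch is only the ordering: the single-branch test comes first, so repeated scans of already-hinted tuples pay nothing, which is the behavior any replacement heuristic has to preserve.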

*) The case of sequential tuples with the same xid is far and away the
most important one.  In OLTP workloads, hint bit i/o is minor compared
to everything else going on.  Also, OLTP workloads are probably better
handled with a hint bit check just before eviction via bgwriter rather
than during the scan.
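Why the same-xid run matters can be shown with a tiny sketch (names and the always-commits stub are invented for illustration): a one-entry cache over the commit check means a bulk load that stamps a long run of tuples with one xid pays for a single lookup, after which every tuple in the run is answered for free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

static int clog_lookups = 0;                 /* counts simulated clog probes */
static TransactionId last_committed_xid = 0; /* 0 = cache empty */

/* Stand-in for a real clog probe: every xid is "committed" in this sketch. */
static bool
probe_clog(TransactionId xid)
{
    (void) xid;
    clog_lookups++;
    return true;
}

/*
 * One-entry cache over the commit check.  Sequential tuples carrying the
 * same xid -- the dominant pattern after a bulk load -- hit the cache and
 * skip the probe entirely.
 */
static bool
xid_committed_cached(TransactionId xid)
{
    if (xid != 0 && xid == last_committed_xid)
        return true;                /* run of identical xids: free */
    if (probe_clog(xid))
    {
        last_committed_xid = xid;
        return true;
    }
    return false;
}
```

A scan over N tuples all stamped by the same transaction does one probe instead of N, which is the case this whole discussion is really about.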

*) The budget for extra work inside HeapTupleSatisfiesMVCC is
exceptionally low.  For this reason, I think your idea would be better
framed at the page level, and the bailout should be measured in the
number of pages, not tuples (that way the page can send in a single
boolean to control hint bit behavior).
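A minimal sketch of that framing, with entirely hypothetical names and a made-up page-budget heuristic: the bailout condition is evaluated once per page, and the per-tuple path then sees only a precomputed boolean, so the visibility check itself costs one branch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-scan state.  The budget is counted in pages, not
 * tuples; pages_dirtied tracks how many pages this scan has already
 * dirtied by writing hint bits.
 */
typedef struct
{
    int pages_dirtied;   /* pages dirtied via hint-bit writes so far */
    int page_budget;     /* how many such pages we will tolerate    */
} DemoScanState;

/* Evaluated once when the scan steps onto a new page. */
static bool
page_allows_hint_writes(const DemoScanState *scan)
{
    return scan->pages_dirtied < scan->page_budget;
}

/*
 * The page hands one boolean down to the tuple loop; if hint writes are
 * allowed, the page is charged against the budget exactly once.
 */
static bool
begin_page(DemoScanState *scan)
{
    bool allow = page_allows_hint_writes(scan);

    if (allow)
        scan->pages_dirtied++;
    return allow;
}
```

The per-tuple cost stays flat no matter how the budget heuristic is tuned, which is what keeps the extra work inside the visibility check within that very low budget.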

*) The upside of optimizing xmax processing is fairly low for most
workloads I've seen.

*) The benchmarking Amit and Hari did needs analysis.

*) For off-cycle release work, we desperately need a standard battery
of comprehensive performance tests and donor machines; that would help
enable patches with complex performance trade-offs (I'm working up
another patch with even more compelling benefits and risks in the
buffer allocator).

merlin

