he works alright in some scenarios,
but then suddenly doesn't work so well, even though not very much has
changed. Behavior like that makes the problems difficult to analyze,
and easy to miss. I'm suspicious of that.
--
Peter Geoghegan
neral sanity check.
The important call to avoid with page-level freezing is the xmin call to
TransactionIdDidCommit(), not the xmax call. The xmax call only occurs
when VACUUM prepares to freeze a tuple that was updated by an updater
(not a locker) that aborted. While the xmin calls will now take place with most
unfrozen tuples.
--
Peter Geoghegan
when lazy_scan_prune() processed the same page
during VACUUM's initial heap pass.
--
Peter Geoghegan
% of all relevant calls to TransactionIdDidCommit(), for
both the freeze_xmin and the freeze_xmax callsites.
--
Peter Geoghegan
v1-0001-Avoid-unnecessary-clog-lookups-when-freezing.patch
Description: Binary data
n is wrong, and simply needs to be removed.
Thanks for the report
--
Peter Geoghegan
least in performance sensitive code.
--
Peter Geoghegan
the
> variable.
Andres was the one that suggested this name, actually. I initially
just called it "freeze", but I think that Andres had it right.
> I think 0001+0002 are about ready.
Great. I plan on committing 0001 in the next few days. Committing 0002
might take a bit longer.
Thanks
--
Peter Geoghegan
On Tue, Dec 20, 2022 at 7:15 PM Peter Geoghegan wrote:
> On Tue, Dec 20, 2022 at 5:44 PM Jeff Davis wrote:
> > Next, the 'freeze_required' field suggests that it's more involved in
> > the control flow that causes freezing than it actually is. All it does
> > is communicate
radically new logic, but it would be good to try
> to catch minor state-handling errors.
Lots of stuff with contrib/amcheck, which, as you must already know,
will notice when an XID/MXID is contained in a table whose
relfrozenxid and/or relminmxid indicates that it shouldn't be there.
(Though VACUUM itself does the same thing, albeit not as effectively.)
Obviously the invariants haven't changed here. In many ways it's a
very small set of changes. But in one or two ways it's a significant
shift. It depends on how you think about it.
--
Peter Geoghegan
ay) 5 regular autovacuums would
have taken, and if you really needed those 5 separate autovacuums to
run, just to deal with the bloat, then that's a real problem. The
aggressive AV effectively causes bloat with such a workload.
--
Peter Geoghegan
2-Wz=MGFwJEpEjVzXwEjY5yx=uunpza6bt4dsmasrgluq...@mail.gmail.com
[3]
https://postgr.es/m/cah2-wznrzc-ohkb+qzqs65o+8_jtj6rxadjh+8ebqjrd1f8...@mail.gmail.com
[4]
https://towardsdatascience.com/the-inspection-paradox-is-everywhere-2ef1c2e9d709
[5]
https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples#Scanned_pages.2C_visibility_map_snapshot
--
Peter Geoghegan
> builds on related added
Fixed.
Thanks
--
Peter Geoghegan
any one factor on its own [1].
[1]
https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples#Opportunistically_advancing_relfrozenxid_with_bursty.2C_real-world_workloads
--
Peter Geoghegan
On Tue, Dec 13, 2022 at 9:16 AM Peter Geoghegan wrote:
> That's not the only thing we care about, though. And to the extent we
> care about it, we mostly care about the consequences of either
> freezing or not freezing eagerly. Concentration of unfrozen pages in
> one particular ta
n
about which kinds of mispredictions we cannot afford to make, and
which kinds are okay. Some mistakes hurt a lot more than others.
--
Peter Geoghegan
ints at whether or not laziness
is a good idea. It's a good idea whenever laziness has a decent chance
of avoiding completely unnecessary work altogether, provided we can
afford to be wrong about that without having to pay too high a cost
later on, when we have to course correct. What this mostly boils down
to is this: lazy freezing is generally a good idea in small tables
only.
--
Peter Geoghegan
arches.
I don't think that the L paper is particularly clear, or
particularly well written. It needs to be interpreted in its original
context, which is quite far removed from the current concerns of
nbtree. It's a 41 year old paper.
--
Peter Geoghegan
ng at any one time. That makes it significantly different to L
as well as nbtree (nbtree is far closer to L than it is to Lanin &
Shasha).
--
Peter Geoghegan
: note: pointer points here
> 01 00 00 00 00 00 00 00 d0 02 00 00 00 00 00 00 d0 02 00 00 00 00 00 00
> 01 00 00 00 00 00 00 00
I bet that this alignment issue can be fixed by using PGAlignedBlock
instead of a raw char buffer for a page. (I'm guessing, I haven't
directly checked.)
--
Peter Geoghegan
zing than other VACUUMs that ran against the same
table in the recent past, of course, so it is important to avoid
accidentally allowing any behavior that looks kind of like the ghost
of aggressive VACUUM.
--
Peter Geoghegan
On Wed, Nov 30, 2022 at 8:13 AM Robert Haas wrote:
> I haven't checked the patches to see whether they look correct, and
> I'm concerned in particular about upgrade scenarios. But if there's a
> way we can get that part committed, I think it would be a clear win.
+1
--
Peter Geoghegan
On Mon, Nov 28, 2022 at 1:52 PM Peter Geoghegan wrote:
> I'm not claiming to know how to build this "better xidStopLimit
> mechanism", by the way. I'm not seriously proposing it. Mostly I'm
> just saying that the question "where do you draw the line if not at 2
&g
'll be able to get rid of freezing in a few years time.
But as long as we have freezing, we have these problems.
--
Peter Geoghegan
afe than it
does with xidStopLimit. Both are removed by the patch.
--
Peter Geoghegan
t always more complicated than "not
enough available XID space".
--
Peter Geoghegan
make them antiwraparound autovacuums). They could almost (though not
quite) now be explained as "an autovacuum that takes place because
it's been a while since we did an autovacuum to deal with bloat and/or
tuple inserts". That will at least be reasonable if you assume all of
the patches get in.
--
Peter Geoghegan
of something being amiss in a way that it just isn't
at the moment. The docs can be updated to talk about antiwraparound
autovacuum as a thing that you should encounter approximately never.
This is possible even though the patch isn't invasive at all.
--
Peter Geoghegan
On Sat, Aug 6, 2022 at 1:03 PM Peter Geoghegan wrote:
> Attached patch teaches autovacuum.c to pass the information down to
> lazyvacuum.c, which includes the information in the autovacuum log.
> The approach I've taken is very similar to the existing approach with
> anti-wraparoun
On Mon, Oct 24, 2022 at 9:00 AM Peter Geoghegan wrote:
> Maybe it could be broken out into a separate "autovacuum triggered by:
> " line, seen only in the autovacuum log instrumentation (and not in
> the similar report output by a manual VACUUM VERBOSE). When we still
&
ves a clearer picture about the conditions in the
table. The relationship between SLRU space and physical heap pages and
the work of freezing is made somewhat clearer by a more proactive
approach to advancing relfrozenxid. That's one way that VACUUM can
lower the uncertainty I referred to.
--
Peter Geoghegan
quot; line into a defensive
"can't happen" ereport(). It couldn't hurt, at least -- we already
have a similar relfrozenxid check nearby, added after the "freeze the
dead" bug was fixed.
--
Peter Geoghegan
this path should just test !TransactionIdIsValid(xid)?
Agreed. Plus there should be a comment that reminds you that this is a
normal regular transaction ID (easy to miss, because the initial "if"
block for Multis is rather large).
I will push something like that soon.
--
Peter Geoghegan
y thing changed by using
DISABLE_PAGE_SKIPPING on top of FREEZE) actually matter to these
tests?
--
Peter Geoghegan
max. Namely: the only special
transaction ID that can ever appear in xmax is InvalidTransactionId.
(Also, it's not okay to see *any* other XID in the
"xmax_already_frozen = true" path, nor would it be okay to leave any
other XID behind in xmax in the nearby "freeze_xmax = true" path.)
--
Peter Geoghegan
caller.
The signature of vacuum_set_xid_limits() is probably going to gain
additional parameters before too long. In any case this seems like a
clear improvement.
--
Peter Geoghegan
v1-0001-Simplify-vacuum_set_xid_limits-signature.patch
Description: Binary data
On Thu, Nov 17, 2022 at 9:02 AM Peter Geoghegan wrote:
> Plan is to commit this later on today, barring objections.
Pushed, thanks.
--
Peter Geoghegan
On Wed, Nov 16, 2022 at 4:34 PM Peter Geoghegan wrote:
> WFM.
Attached is v3.
Plan is to commit this later on today, barring objections.
Thanks
--
Peter Geoghegan
v3-0001-Standardize-rmgrdesc-recovery-conflict-XID-output.patch
Description: Binary data
s sense to call this out, but I'd
> s/snapshotConflictHorizon format XIDs/cutoff with snapshotConflictHorizon
> semantics/
>
> or such?
WFM.
--
Peter Geoghegan
h calling this out directly in these comments IMV. We're
addressing two closely related things that assign opposite meanings to
InvalidTransactionId, which is rather confusing.
--
Peter Geoghegan
On Tue, Nov 15, 2022 at 8:48 PM Peter Geoghegan wrote:
> Okay, let's go with snapshotConflictHorizon. I'll use that name in the
> next revision, which I should be able to post tomorrow.
Attached is a somewhat cleaned up version that uses that symbol name
for everything.
--
Peter Geoghega
morrow.
--
Peter Geoghegan
o do
cleanup for aborted transactions, per the optimization in
HeapTupleHeaderAdvanceLatestRemovedXid().
Perhaps something like "mustBeCommittedCutoff" would work better? What
do you think of that? The emphasis on how things need to work on the
REDO side seems useful.
--
Peter Geoghegan
On Fri, Nov 11, 2022 at 10:38 AM Peter Geoghegan wrote:
> Attached is v4, which removes the old comments you pointed out were
> now out of place (they weren't adding much anyway). Also fixed bitrot
> against HEAD from today's visibility map commit from Jeff Davis.
Pushed somet
to avoid similar mistakes in the
future.
I'm not necessarily that attached to the name latestCommittedXid. It
is more accurate, but it's also a little bit too similar to another
common XID symbol name, latestCompletedXid. Can anyone suggest an
alternative?
--
Peter Geoghegan
v1-0001-Standardize
On Mon, Nov 14, 2022 at 2:58 PM Andres Freund wrote:
> On 2022-11-14 14:42:16 -0800, Peter Geoghegan wrote:
> > What does this have to tell us, if anything, about the implications
> > for code on HEAD?
>
> Nothing really test I sent (*) - I wanted to advance the discussion ab
nything, about the implications
for code on HEAD? I don't see any connection between this problem and
the possibility of a live bug on HEAD involving freezing later tuple
versions in a HOT chain, leaving earlier non-frozen versions behind to
break HOT chain traversal code. Should I have noticed such a
connection?
--
Peter Geoghegan
re we we'd best put a comment with a warning
> about this fact.
We already have comments discussing the relationship between
OldestXmin and vistest (as well as rel_pages) in heap_vacuum_rel().
That seems like the obvious place to put something like this, at least
to me.
--
Peter Geoghegan
I agree with Peter. We have to try to get that case right. If we can
> eventually eliminate it as a valid case by some mechanism, hooray. But
> in the meantime we have to deal with it as best we can.
Practiced intellectual humility seems like the way to go here. On some
level I suspect that we'll h
sequence requires 3 sessions, not just 2.
--
Peter Geoghegan
mpanying WAL record, or an atomic action with a
WAL record whose REDO routine needs to reliably reproduce the same
on-disk state as original execution (barring preexisting differences
in how hint bits are set between original execution and a hot
standby).
--
Peter Geoghegan
eal with the recovery conflict issues created by my big
page-level freezing/freezing strategy patch set in the patch set
itself.
Will commit this early next week barring any objections.
Thanks
--
Peter Geoghegan
v4-0001-Shrink-freeze-WAL-records-via-deduplication.patch
Description: Binary data
On Tue, Sep 20, 2022 at 3:12 PM Peter Geoghegan wrote:
> Attached is v2, which I'm just posting to keep CFTester happy. No real
> changes here.
Attached is v3. I'd like to move forward with commit soon. I'll do so
in the next few days, barring objections.
v3 has vacuumlazy.
al,
> >and feels like something worth fixing. The fundamental unit of work in
> >lazy_scan_heap() is a *scanned* heap page.
>
> It makes perfect sense to use the scanned_pages instead.
Want to have a go at writing a patch for that?
--
Peter Geoghegan
" was limited to updates that look like
that. So I don't know what you mean about never allowing the horizon
to be raised.
--
Peter Geoghegan
o the correct chain member.
> */
> if (i >= nchain)
> heap_prune_record_dead(prstate, rootoffnum);
> else
> heap_prune_record_redirect(prstate, rootoffnum,
> chainitems[i]);
I don't see why this code is relevant.
--
Peter Geoghegan
, and never has been. (The term "freeze xmax" is
a bit ambiguous, though it usually means set xmax to
InvalidTransactionId.)
3. There is no specific reason to believe that there is a live bug here.
Putting all 3 together: doesn't it seem quite likely that the way that
we compute OldestXmin is the factor that prevents "skewering" of an
update chain? What else could possibly be preventing corruption here?
(Theoretically it might never have been discovered, but that seems
pretty hard to believe.)
--
Peter Geoghegan
er, I don't think it's possible for
OldestXmin/FreezeLimit to take on a value like that (i.e. a value that
"skewers" the update chain like this, the value 7 from your example).
We ought to be able to rely on an OldestXmin value that can never let
such a situation emerge. Right?
--
Peter Geoghegan
ke something worth fixing. The fundamental unit of work in
lazy_scan_heap() is a *scanned* heap page.
--
Peter Geoghegan
oking concept.
[1]
https://postgr.es/m/CAH2-WzkQ86yf==mgAF=cq0qelrwkx3htlw9qo+qx3zbwjjk...@mail.gmail.com
--
Peter Geoghegan
a degree that satisfies autovacuum.c. It will launch antiwraparound
autovacuums again and again, never realizing that VACUUM doesn't
really care about what it expects (at least not with the reloption in
use). Clearly that's just broken. It also suggests a more general
design problem, at least in my mind.
--
Peter Geoghegan
d forth
conversation between autovacuum.c and vacuumlazy.c that makes sure
that something useful happens sooner or later. The passage of time
really matters here.
As a bonus, we might be able to get rid of the autovacuum GUC
variants. Plus the current autovacuum logging would just be how we'd
log every VACUUM.
--
Peter Geoghegan
able to avoid throwing an error
if we do that. But isn't that already true with the patch that we
have? Is it really a "true VACUUM" if the operation can never advance
datfrozenxid? At least selectively demoting the ERROR to a WARNING is
"transparent" about it.
--
Peter Geoghegan
might want to do it despite the
fact that there are relatively few LP_DEAD items in heap pages.
I don't think that we need to worry too much about how VACUUM itself
might apply the same information for now, but it's something that you
might want to consider.
--
Peter Geoghegan
t; accomplish. In other words, a before/after of the interesting cases.
That's on my TODO list. Mostly it's an independent thing to this
(antiwraparound) autovacuum stuff, despite the fact that both projects
share the same underlying philosophy.
--
Peter Geoghegan
On Mon, Oct 24, 2022 at 7:56 AM Peter Geoghegan wrote:
> I don't understand what you mean. FreezeLimit *isn't* always exactly
> 50 million XIDs before OldestXmin -- not anymore. In fact that's the
> main benefit of this work (commit c3ffa731). That detail has changed
> (and changed fo
destXmin -- it would be set to OldestXmin directly when the
WARNING was given. But now we get smoother behavior, without any big
discontinuities in how FreezeLimit is set over time when OldestXmin is
held back generally.
--
Peter Geoghegan
uum in our autovacuum log
report (actually we'd still report autovacuums aš antiwraparound
autovacuum in cases where that still happened, which won't be
presented as just another triggering criteria in the report).
[1]
https://www.postgresql.org/message-id/flat/cah2-wzneqmkmry8feudk8xdph37-4anygf7a04bwxoc1gkd...@mail.gmail.com
--
Peter Geoghegan
On Thu, Oct 20, 2022 at 11:52 AM Peter Geoghegan wrote:
> Leaving my patch series aside, I still don't think that it makes sense
> to make it impossible to auto-cancel antiwraparound autovacuums,
> across the board, regardless of the underlying table age.
One small thought on the pre
chance to resolve the problem? Why take a risk of
causing much bigger problems (e.g., blocking automated DDL that blocks
simple SELECT queries) before the point that that starts to look like
the lesser risk (compared to hitting xidStopLimit)?
--
Peter Geoghegan
log/manta-postmortem-7-27-2015
[2]
https://postgr.es/m/CAH2-WzkU42GzrsHhL2BiC1QMhaVGmVdb5HR0_qczz0Gu2aSn=a...@mail.gmail.com
--
Peter Geoghegan
ould also expect a similar focus on the picture over time to be
useful with the indexing stuff, for roughly the same underlying
reasons.
[1]
https://postgr.es/m/CAA8Fd-pB=mr42YQuoaLPO_o2=xo9yjnjq23cyjdfwc8sxgm...@mail.gmail.com
--
Peter Geoghegan
o show the WARNING
in most cases where freeze_min_age has been completely ignored (it
must be ignored in extreme cases because FreezeLimit must always be <=
OldestXmin). Also, the proposed new WARNING is only seen when we're
bound to also see the existing OldestXmin WARNING already. Why have 2
WARNINGs a
It's easy to see why this
code is broken, but to me it seems related to having too much
confidence in what is possible while relying on cleanup locks. That's
just my take.
--
Peter Geoghegan
omfortably close to "a buffer lock, but somehow also not a buffer
lock".
--
Peter Geoghegan
On Mon, Oct 10, 2022 at 4:46 PM Peter Geoghegan wrote:
> There is no reason to think that the user will also (say) set the
> autovacuum_freeze_table_age reloption separately (not to be confused
> with the vacuum_freeze_table_age GUC!). We'll usually just work off
> the GUC (I mean why
whole way that vacuum_freeze_table_age and
autovacuum_freeze_max_age are supposed to work together seems very
confusing to me. I'm not surprised that this was overlooked for so long.
--
Peter Geoghegan
needed to be
written, so there is actually no reason to care about dirtying the
page. No?
I'm in favor of reducing the number of WAL records required in common
cases if at all possible -- purely because the generic WAL record
overhead of having an extra WAL record does probably add to the WAL
overh
On Tue, Oct 4, 2022 at 10:39 AM Jeff Davis wrote:
> On Mon, 2022-10-03 at 22:45 -0700, Peter Geoghegan wrote:
> > Once a table becomes larger than vacuum_freeze_strategy_threshold,
> > VACUUM stops marking pages all-visible in the first place,
> > consistently marking th
he high level concepts.
So we avoid big spikes, and try to do the work when it's cheapest.
--
Peter Geoghegan
ust to reduce the deferred work, without worrying about advancing
> relfrozenxid.
True. Though I think that a strong bias in the direction of advancing
relfrozenxid by some amount (not necessarily by very many XIDs) still
makes sense, especially when we're already freezing aggressively.
--
Peter Geoghegan
.
To me this dynamic seems qualitatively different to other cases, where
we might want to give some weight to uncertainty. Understanding where
the boundaries lie between those trickier cases and this simpler case
seems important and relevant to me.
--
Peter Geoghegan
On Sun, Oct 2, 2022 at 3:43 AM Robert Haas wrote:
> On Fri, Sep 30, 2022 at 2:24 PM Peter Geoghegan wrote:
> > It's not just that the risks are ludicrously high, of course. The
> > potential benefits must *also* be very low. It's both factors,
> > together.
>
> Hmm,
cs that
were challenging to the implementation and worth specifically getting
right. For example, "saw tooth" input.
--
Peter Geoghegan
g. More generally, thinking about how things work across
multiple layers of abstraction seems like it could be valuable in
other ways.
[1]
https://postgr.es/m/CAH2-WzmLCn2Hx9tQLdmdb+9CkHKLyWD2bsz=pmrebc4daxj...@mail.gmail.com
--
Peter Geoghegan
bogosort could ever be better than quicksort. Having
fewer choices by just not offering inherently bad choices seems quite
unrelated to what you're talking about.
For all I know you might be onto something. But it really seems
independent to me.
--
Peter Geoghegan
is perhaps a theoretical sense in which
that isn't quite true, but it's true for all practical purposes, which
should be enough.
--
Peter Geoghegan
uently. But like that list of two is
> pretty much the whole list. I think we've talked ourselves into
> believing that this problem is much harder than it really is.
+1
--
Peter Geoghegan
s related to TRUST_STRXFRM:
https://postgr.es/m/cah2-wzmqrjqv9pgyzebgnqmcac1ct+uxg3vqu7ksvundf_y...@mail.gmail.com
--
Peter Geoghegan
ight. Though I am actually sympathetic to the idea that users might
gladly pay a cost for performance stability -- even a fairly large
cost. That part doesn't seem like the problem.
--
Peter Geoghegan
ult is
quite another -- that seems far harder.
--
Peter Geoghegan
don't. We can perhaps be
approximately 100% sure that something like that will be true in all
cases, no matter the details. That seems like a very different concept
to what you've proposed.
--
Peter Geoghegan
performance cost).
I thought that that was unjustified myself.
--
Peter Geoghegan
On Wed, Sep 28, 2022 at 9:59 PM David Rowley wrote:
> select b from t1 order by b offset 100;
>
> Master:
> latency average = 344.763 ms
>
> Patched:
> latency average = 268.374 ms
>
> about 28% faster.
That's more like it!
--
Peter Geoghegan
On Wed, Sep 28, 2022 at 6:13 PM David Rowley wrote:
> Master:
> latency average = 313.197 ms
>
> Patched:
> latency average = 304.335 ms
>
> So not a very impressive speedup there (about 3%)
Worth a try, at least. Having a more consistent interface is valuable
in itself too.
--
Peter Geoghegan
with something like a varbyte encoding, and maybe
the complexity it imposes just won't be worth it -- I really have no
opinion on that just yet.
--
Peter Geoghegan
On Wed, Sep 28, 2022 at 4:00 PM Peter Geoghegan wrote:
> I am reminded of the discussion that led to bugfix commit c2d4eb1b
> some years back.
Also potentially relevant: the 2017 commit fa117ee4 anticipated adding
a "copy" argument to tuplesort_getdatum() (the same commit ad
calling
tuplesort_gettuple_common()). We used to sometimes do that with
tuplesort_getindextuple() and possible other such routines, but the
need for that capability was eliminated on the caller side around the
same time as the bugfix went in.
--
Peter Geoghegan
l PRUNE records, for whatever reason.
This is just a made up example, so the specifics might be off
significantly -- I'd have to work on it to be sure. Hopefully the
example still gets the general idea across.
--
Peter Geoghegan
ctually check both!
--
Peter Geoghegan
dered individually).
--
Peter Geoghegan
601 - 700 of 3084 matches
Mail list logo