would help, since llvm-gcov requires it,
but that doesn't seem particularly likely.
It's definitely generally recommended that "-O0" be used, so I think
that we can agree that that was an improvement, even if it doesn't fix
the remaining problem that I noticed when I rechecked nbtuti
On Fri, Apr 12, 2019 at 10:49 AM Peter Geoghegan wrote:
> It's definitely generally recommended that "-O0" be used, so I think
> that we can agree that that was an improvement, even if it doesn't fix
> the remaining problem that I noticed when I rechecked nbtutils.c.
I'm n
be difficult to
impossible as a practical matter.
Unfortunately, you're both right. I don't know where that leaves us.
--
Peter Geoghegan
few hundred
contiguous logical values, you can justify aggressively compressing
the representation in the B-Tree entries. Compression would still be
based on prefix compression, but the representation itself can be
specialized.
--
Peter Geoghegan
e project begins to mature. I have little faith in our
ability to predict which approach will be the least painful at this
early stage.
--
Peter Geoghegan
been trying to get you to get on
> board with allowing different leaf-level "item pointer equivalents"
> widths inside nbtree...
Getting me to agree that that would be nice and getting me to do the
work are two very different things. ;-)
--
Peter Geoghegan
On Tue, Apr 16, 2019 at 12:00 PM Peter Geoghegan wrote:
> Can you be more specific? What was the cause of the corruption? I'm
> always very interested in hearing about cases that amcheck could have
> detected, but didn't.
FWIW, v4 indexes in Postgres 12 will support the new &qu
ot;
at P_FIRSTDATAKEY() within internal pages instead. That would be
useful for other things anyway (e.g. prefix compression).
--
Peter Geoghegan
, rather than counting tuples per se.
I like that idea, but I'm pretty sure that there are very few users
that are aware of these distinctions at all. It would be a good idea
to clearly document them.
--
Peter Geoghegan
ate an open item for this, and begin work on a patch tomorrow.
--
Peter Geoghegan
me the per-datum overhead seems very high to me. Maybe
prefix compression could help here, which a low key and high key can
do rather well.
--
Peter Geoghegan
least like
to know the settings used by its builds.
--
Peter Geoghegan
efer to
have optimizations enabled if I was optimizing my code, but that's not
what the web resource is for, really.
Thanks
--
Peter Geoghegan
't have time to confirm all this right now, but I am pretty
confident that they have both problems. And almost as confident that
they'd notice substantial benefits from this patch series.
--
Peter Geoghegan
On Mon, May 13, 2019 at 9:09 PM Peter Geoghegan wrote:
> Even when that happens, the index is already considered corrupt by
> VACUUM, so the same VACUUM process that could in theory be adversely
> affected by removing the half-dead internal page check will complain
> about the page
On Thu, May 16, 2019 at 1:05 PM Peter Geoghegan wrote:
> Actually, now that I look back at how page deletion worked 5+ years
> ago, I realize that I have this slightly wrong: the leaf level check
> is not sufficient to figure out if the parent's right sibling is
> pending del
Perhaps some hack with effects confined to
> pg_upgrade's test would be better. I don't have a good idea what
> that would look like, however.
>
> Or we could just say this isn't annoying enough to fix.
I think it's worth fixing.
--
Peter Geoghegan
hen I
use the optimum number of jobs, so I don't consider it to be urgent.
I'm happy with the new approach, since it avoids the problem of
regression.diffs files that get deleted before I have a chance to take
a look. I should thank Noah for his work on this.
--
Peter Geoghegan
On Fri, May 24, 2019 at 12:31 PM Peter Geoghegan wrote:
>
> On Wed, May 22, 2019 at 3:57 PM Tom Lane wrote:
> > I experimented with the attached quick-hack patch to make pg_regress
> > suppress notices from its various initial DROP/CREATE IF [NOT] EXISTS
> > commands. I'
ng as we're not going to
those lengths, there may be some value in keeping the macaddr8 code in
line with macaddr code -- the two types are currently almost the same
(the glaring difference is the lack of macaddr8 sort support).
We'll need to draw the line somewhere, and that is likely to be a bit
arbitrary. This was what I meant by "weird".
--
Peter Geoghegan
8, and possibly
doing the same with the original macaddr type.
Do you think that you'll be able to work on the project with this
expanded scope, Melanie?
--
Peter Geoghegan
make sense, but I don't think that this
patch needs to do that. There is at least one pre-existing cases that
does this -- macaddr.
--
Peter Geoghegan
ian
execution time as representative seems to be the best approach.
--
Peter Geoghegan
memset() at the
equivalent point for macaddr8, since we cannot "run out of bytes from
the authoritative representation" that go in the Datum/abbreviated
key. I suppose that the memset() should simply be removed, since it is
superfluous here.
--
Peter Geoghegan
to pick a number at random?
It also looks like pg_proc.dat should be updated, since it still
mentions the old custom of trying to use contiguous OIDs. It also
discourages the practice of picking OIDs at random, which is almost
the opposite of what it should say.
--
Peter Geoghegan
is visible
to verification's heap scan.
The CREATE INDEX CONCURRENTLY bug that Pavan found a couple of years
back while testing the WARM patch is one example. A bug that was
fallout from the DROP INDEX CONCURRENTLY work is another historic
example. Alvaro will recall that this same check had a role in the
d description in the
EXPLAIN output for a bounded sort node, even though that approach
doesn't seem desirable in general.
--
Peter Geoghegan
> committed untested. I am surprised that we have not seen that
> complain yet.
Why is that surprising?
https://coverage.postgresql.org/src/backend/access/rmgrdesc/index.html
--
Peter Geoghegan
d have been tested when it went in, but I'm not surprised that
the bug remained undetected for a year. Not that many people use
pg_waldump.
--
Peter Geoghegan
d work could use some
specific non-goals. Perhaps I just haven't being paying enough
attention to have noticed them.
--
Peter Geoghegan
hit "next" a few times, until heap_update()'s "page"
variable is initialized.
--
Peter Geoghegan
gt; the page to the list by following the procedure described here:
> Hm, what are you hoping to glean by doing so?
Nothing in particular. I see no reason to assume that we know what
that looks like, though. It's easy to check.
--
Peter Geoghegan
eird, and needs to be explained. If it turns out that it isn't
actually called less often, then I would suggest that the speedup
might be related to memory fragmentation, which often matters a lot
within tuplesort.c. (This is why external sort merging now uses big
buffers, and double buffering.)
--
Peter Geoghegan
ts in tuplesort.c, so if memory
> fragmentation is a real concern this patch could definitely notice
> changes in that regard.
Sounds like it's probably fragmentation. That's generally hard to measure.
--
Peter Geoghegan
nts when various conditions are met), but in practice
the behavior isn't meaningfully different from blocking reads.
--
Peter Geoghegan
On Wed, Jun 12, 2019 at 1:16 PM Bruce Momjian wrote:
> First, my apologies in getting to this so late. Peter Geoghegan
> supplied me with slides and a video to study, and I now understand how
> complex the btree improvements are. Here is a video of Peter's
> presentat
it
in a way that doesn't run into buildfarm issues due to alignment. I
suggest an index on a text column to avoid problems.
--
Peter Geoghegan
at is set
up within _bt_first(). This hasn't been adopted to the patch at all,
so you'll probably need to do that.
The patch should be considered a very rough hack, for now. It leaks
memory like crazy. But I think that you'll find it helpful.
--
Peter Geoghegan
0012-Index-scan-positi
e these
details, even in their introductory material on how B-Tree indexes
work. The term "suffix truncation" isn't something users have heard
of, and we shouldn't use it here, but the *idea* of suffix truncation
is very well established. As I mentioned, it matters for things like
covering indexes (indexes that are designed to be used by index-only
scans, which are not necessarily INCLUDE indexes).
Thanks!
--
Peter Geoghegan
On Wed, Jun 12, 2019 at 7:29 PM Bruce Momjian wrote:
> OK, I mentioned something about increased locality now. Patch attached.
Looks good -- locality is a good catch-all term.
Thanks!
--
Peter Geoghegan
On Sat, Jun 15, 2019 at 2:11 PM Peter Geoghegan wrote:
> On Sat, Jun 15, 2019 at 1:39 PM Noah Misch wrote:
> > To me, this text implies a cautious DBA should amcheck every index. Reading
> > the thread[1], I no longer think that. It's enough to monitor that VACUUM
> >
On Sat, Jun 15, 2019 at 3:05 PM Tom Lane wrote:
> Thanks for the input, guys. What do you think of
>
> Avoid writing an invalid empty btree index page in the unlikely case
> that a failure occurs while processing INCLUDEd columns during a page
> split (
occur when we
allocate a temp page buffer. I verified that this causes no
significant issue for VACUUM. It is best avoided, since we still
"leak" the new page/buffer, although that is almost harmless.
--
Peter Geoghegan
On Wed, May 8, 2019 at 3:37 PM Peter Geoghegan wrote:
> It makes perfect sense for _bt_split() to call _bt_findsplitloc()
> directly, since _bt_findsplitloc() is already aware of almost every
> _bt_split() implementation detail, whereas those same details are not
> of interest anywh
gt; made possible by previous commits, or a "fix" for a previous commit.
Yes. It's a bug fix that went in after feature freeze.
--
Peter Geoghegan
he right call. I don't think that I
am alone in seeing it this way.
--
Peter Geoghegan
On Fri, May 10, 2019 at 6:02 PM Bruce Momjian wrote:
> Have new btree indexes sort duplicate index entries in heap-storage
> order (Peter Geoghegan)
>
> This slightly reduces the maximum-allowed length of in
On Sat, May 11, 2019 at 11:02 AM Bruce Momjian wrote:
> OK, commit removed.
You're mistaken -- nothing has been pushed to master in the last 3 hours.
--
Peter Geoghegan
On Sun, May 5, 2019 at 1:14 PM Peter Geoghegan wrote:
> Attached draft patch adjusts code comments and error messages where a
> line pointer is referred to as an item pointer. It turns out that this
> practice isn't all that prevalent. Here are some specific concerns
> that I had to
heap-only tuple),
>
> Should "that item" also be re-worded, for consistency?
Yes, it should be -- I'll fix it.
I'm going to backpatch the storage.sgml change on its own, while
pushing everything else in a separate HEAD-only commit.
Thanks
--
Peter Geoghegan
On Tue, May 7, 2019 at 9:59 AM Peter Geoghegan wrote:
> On Tue, May 7, 2019 at 12:27 AM Heikki Linnakangas wrote:
> > I don't understand that reasoning. Yes, _bt_pagedel() will complain if
> > it finds a half-dead internal page. But how does that mean that
> > _bt_lock_
On Mon, May 13, 2019 at 8:30 PM Tom Lane wrote:
> Peter Geoghegan writes:
> > To be fair, I suppose that the code made more sense when it first went
> > in, because at the time there was a chance that there could be
> > leftover half-dead internal pages. But, that wa
TIDs as a whole new type of tuple with its own set of
specialized functions in tuplesort.c, which has problems of its own,
then it's kind of awkward to do it some other way.
--
Peter Geoghegan
rruption. Admittedly, amcheck didn't
find any bugs in my code after the first couple of versions of the
patch series, so this approach seems unlikely to find any problems
now. Even still, it wouldn't be very difficult to do this extra step.
It seems worthwhile to be thorough here, given that we dep
ll
definitely do it that way if there is sufficient work_mem)?
--
Peter Geoghegan
g about what the expectations are for
> fd.c, and some rejiggering of PrepareTempTablespaces's API too.
> I'm not sufficiently excited about this issue to do that.
+1. Let's close this one out.
--
Peter Geoghegan
On Fri, May 17, 2019 at 6:36 PM Tom Lane wrote:
> Will do so tomorrow. Should we back-patch this?
I wouldn't, because I see no reason to. Somebody else might.
--
Peter Geoghegan
ffected, because tuples that are that wide are already compressed in
almost all cases -- it seems like it would be hard to be just at the
edge of the limit already.
Thanks
--
Peter Geoghegan
0001-Suggested-changes-to-v12-release-notes.patch
Description: Binary data
On Tue, May 7, 2019 at 6:15 PM Peter Geoghegan wrote:
> I suppose I'm biased, but I prefer the new approach anyway. Adding the
> left high key first, and then the right high key seems simpler and
> more logical. It emphasizes the similarities and differences between
> leftpage and r
ty" sentinel values) also contributes to
making indexes smaller.
The page split stuff was mostly added by commit fab250243 ("Consider
secondary factors during nbtree splits"), but commit f21668f32 ("Add
"split after new tuple" nbtree optimization") added to that in
d point under normal circumstances.
This does seem like an unfriendly behavior. Moving the thread over to
the -hackers list for further discussion...
--
Peter Geoghegan
On Mon, May 20, 2019 at 3:17 PM Andres Freund wrote:
>
>
>
> Improve speed of btree index insertions (Peter Geoghegan,
> Alexander Korotkov)
>
My concern here (which I believe Alexander shares) is that it doesn't
make sense to group these
On Tue, Apr 30, 2019 at 9:44 AM Andreas Joseph Krogh
wrote:
> I have a 1.4GB dump (only one table) which reliably reproduces this error.
> Shall I share it off-list?
I would be quite interested in this, too, since there is a chance that
it's my bug.
--
Peter Geoghegan
On Tue, Apr 30, 2019 at 11:54 AM Andreas Joseph Krogh
wrote:
> Nice, thanks!
Thanks for the report!
--
Peter Geoghegan
umps that were larger than that by providing a Google drive
link. Something like that should work reasonably well.
--
Peter Geoghegan
On Tue, Apr 30, 2019 at 9:56 AM Andreas Joseph Krogh wrote:
> I've sent you guys a link (Google Drive) off-list.
I'll start investigating the problem right away.
Thanks
--
Peter Geoghegan
On Tue, Apr 30, 2019 at 9:59 AM Peter Geoghegan wrote:
> I'll start investigating the problem right away.
I have found what the problem is. I simply neglected to make a
conservative assumption about suffix truncation needing to add a heap
TID to a leaf page's new high key in nbtsort.c (follow
s that
cannot be fixed by reordering something in application code).
--
Peter Geoghegan
On Sun, Apr 28, 2019 at 10:15 AM Alexander Korotkov
wrote:
> I think this definitely not bug fix. Bloom filter was designed to be
> lossy, no way blaming it for that :)
I will think about a simple fix, but after the upcoming point release.
There is no hurry.
--
Peter Geoghegan
On Tue, Apr 30, 2019 at 10:58 AM Peter Geoghegan wrote:j
> I have found what the problem is. I simply neglected to make a
> conservative assumption about suffix truncation needing to add a heap
> TID to a leaf page's new high key in nbtsort.c (following commit
> dd299df8189), even tho
r in your prototype, but it's not a great long term solution.
--
Peter Geoghegan
; only. This would involve removing
the comment in itemid.h that confusingly refers to line pointers as
"item pointers" (plus any other comments that fail to make a clear
distinction).
I think that the total number of comments that would be affected by
this approach is quite low.
--
Peter Geoghegan
problems without any real benefit. OTOH, calling
two closely related but distinct things by the same name is atrocious.
--
Peter Geoghegan
't know that much about, whereas off_t is used all over the
backend code.
--
Peter Geoghegan
have done so).
I am interested in making the code less complicated. If we can remove
the work_mem kludge for Windows as a consequence of that, then so much
the better.
--
Peter Geoghegan
or not *fully* banning the use of "long" is something that
will simplify the code is debatable. However, we could substantially
reduce the use of "long" across the backend without any real downside.
The work_mem question can be considered later. Does that seem
reasonable?
--
Peter Geoghegan
ng int64, and
being to support negative integers can be useful in some contexts
(though not this context).
--
Peter Geoghegan
lsewhere.
I was aware of that, but I wasn't aware of how many places that bleeds
into until I checked just now.
It would be nice if we could figure out how to make it obvious that
the idioms around the use of long for work_mem stuff are idioms that
have a specific rationale. It's pretty confusing as th
ure same can happen as well.
I believe that the test coverage of GiST index builds is something
that is being actively worked on right now. It's a recognized problem
[1].
[1] https://postgr.es/m/24954.1556130...@sss.pgh.pa.us
--
Peter Geoghegan
On Mon, May 6, 2019 at 5:15 PM Peter Geoghegan wrote:
> VACUUM asserts P_FIRSTDATAKEY(opaque) > PageGetMaxOffsetNumber(page)
> within _bt_mark_page_halfdead(), but doesn't test that condition in
> release builds. This means that the earliest modifications of the
> right page, befo
On Mon, May 6, 2019 at 4:11 PM Peter Geoghegan wrote:
> The important question is how VACUUM will recognize it. It's clearly
> not as bad as something that causes "failed to re-find parent key"
> errors, but I think that VACUUM might not be reclaiming it for the FSM
> (
o more, the actual business of performing a
split has no reason to change that I can think of. I would like to
keep _bt_split() as simple as possible anyway -- it should only be
copying tuples using simple primitives like the bufpage.c routines.
Living with what we have now (not using a temp buffer for the right
page) seems fine.
--
Peter Geoghegan
that was introduced in Postgres 11, and made much
more likely in practice in Postgres 12. (I haven't got as far as doing
an analysis of the risks to page deletion, though. The "fastpath"
rightmost page insertion optimization that was also added to Postgres
11 seems like it also might need to b
On Mon, May 6, 2019 at 12:48 PM Peter Geoghegan wrote:
> On the other other hand, it seems to me that the PageGetTempPage()
> thing might have been okay, because it happens before the high key is
> inserted on the new right buffer page. The same cannot be said for the
> way we generat
on't get that error, because we're talking about a
corrupt index, and all bets are off -- no reason to think that the
half-dead internal page would be consistent with other pages in any
way. But even then, you'll go on to report it in the usual way, since
VACUUM scans nbtree indexes in physical order.
--
Peter Geoghegan
lot of typing each time, as in:
:ea SELECT ...
--
Peter Geoghegan
On Mon, May 6, 2019 at 4:11 PM Peter Geoghegan wrote:
> I am tempted to move the call to _bt_truncate() out of _bt_split()
> entirely on HEAD, possibly relocating it to
> nbtsplitloc.c/_bt_findsplitloc(). That way, there is a clearer
> separation between how split points are ch
On Wed, Apr 17, 2019 at 7:37 PM Peter Geoghegan wrote:
> Tentatively, I think that the fix here is to not "itup_key->scantid =
> NULL" when a NULL value is involved (i.e. don't "itup_key->scantid =
> NULL" when we see IndexTupleHasNulls(itup) within _bt_doins
ute was NULL. It should only do that when a key attribute is
NULL.
--
Peter Geoghegan
On Thu, Apr 18, 2019 at 8:13 PM Peter Geoghegan wrote:
> It just occurred to me that the final patch will need to be more
> careful about non-key attributes in INCLUDE indexes. It's not okay for
> it to avoid calling _bt_check_unique() just because a non-key
> attribute was NULL. It
On Tue, Apr 30, 2019 at 6:28 PM Peter Geoghegan wrote:
> Attached is a much more polished version of the same patch. I tried to
> make clear how the "page full" test (the test that has been fixed to
> take heap TID space for high key into account) is related to ot
unlikely. That's why I would like to understand the problem that you
found with the check.
--
Peter Geoghegan
cies
even with such a large index. I admit that its unfriendly that users
are not warned about the shortage currently, but that is something we
can probably find a simple (backpatchable) fix for.
--
Peter Geoghegan
ernal pages.
I think that the simple answer to your question is yes. It would be
more complicated that way, and the only extra check would be the check
of high keys against their parent, but overall this does seem
possible.
--
Peter Geoghegan
off_t" is only 32-bits on Windows, which broke parallel CREATE
INDEX (issued fixed by commit aa551830). I suppose that "off_t" is
really a variant of the same problem.
--
Peter Geoghegan
practical, and without any real upside. But it
would be nice to identify integer types where there is a real risk of
making an incorrect assumption, and then eliminate that risk once and
for all.
--
Peter Geoghegan
, and should be avoided. ISTM that the
least confusing way of removing the ambiguity would be to no longer
refer to ItemIds as item pointers, without changing anything else.
--
Peter Geoghegan
On Fri, Apr 26, 2019 at 5:13 PM Peter Geoghegan wrote:
> On Fri, Apr 26, 2019 at 5:05 PM Tom Lane wrote:
> > Yeah, I'd be fine with that, although the disconnect between the type
> > name and the comment terminology might confuse some people.
>
> Maybe, but the fact that
On Fri, Mar 1, 2019 at 3:59 PM Peter Geoghegan wrote:
> /*
> * Perform the same check on this internal level that
> * _bt_mark_page_halfdead performed on the leaf level.
> */
> if (_bt_is_page_halfdead(rel, *rightsib
On Thu, Jul 4, 2019 at 10:38 AM Peter Geoghegan wrote:
> I tried this on my own "UK land registry" test data [1], which was
> originally used for the v12 nbtree work. My test case has a low
> cardinality, multi-column text index. I chose this test case because
> it
601 - 700 of 3085 matches
Mail list logo