On Fri, May 4, 2018 at 2:39 AM, Teodor Sigaev <teo...@sigaev.ru> wrote:
> Thank you, pushed.
> or only if the item wasn't found in cache. There's some coherency
> differences, obviously.
+1. This would be a good option for amcheck, too.
page. Therefore I was more thinking to falling back to smgrread() or
I'm not envisaging anything specific just yet. It would be nice if
amcheck had an option that bypassed shared_buffers, because users want
that. That's all.
e to focus on making this
accessible to backup tools, that should ideally work without needing
to acquire any MVCC snapshot. Probably from a front-end utility. We'd
need to export at least some operator class functionality to make that
On Thu, May 3, 2018 at 5:16 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Thu, May 3, 2018 at 12:37 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Wed, May 2, 2018 at 3:06 PM, Vladimir Sitnikov
>> <sitnikov.vladi...@gmail.com> wrote:
little point in changing
> the error handling, still it would be good to get test coverage. That's
> not really a bug though as in all those cases we don't get errors like
> "could not open file" or such. So we could also leave things as they are
Now that the relkind iss
to make it more
accessible, though it's already quite accessible.
On Fri, May 11, 2018 at 11:46 PM, Christoph Berg <m...@debian.org> wrote:
> Re: Peter Geoghegan 2018-05-12
DD COLUMN sets a default value for a
column". Referencing the fact that Postgres was previously able to do
this with a NULL default doesn't seem to add anything.
On Fri, May 11, 2018 at 12:17 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> * Suggest replacement sort item be phrased as: "Remove the
> configuration parameter replacement_sort_tuples. The
> replacement selection sort algorithm is no longer used."
Also, it should be moved
debating something that's inconsequential?
FWIW, I am neutral on the important question of whether or not this
patch should actually be back-patched.
to do this stuff without the user having to read a tedious explanation
of how the different parts fit together.
don't you start a thread about it? I'd actually even suggest
> backpatching it.
PostGIS has one of these too, which covers several languages,
including SQL. Perhaps it would be worth tightening up the indentation
of SQL files while we're at it.
That change was also left out. If I'm mistaken in how I interpreted
the sentence, then Andres should follow up.
On Sat, Jun 9, 2018 at 10:40 AM, Daniel Gustafsson wrote:
> Thanks, I’m honoured to get the first commit.
Keep them coming. :-)
el as they are matched/killed. It should be safe to avoid
rechecking anything other than the heap TID values.
(unless perhaps you assume that that problem is
solved by something else, such as zheap). The mechanism that Andrey
describes is rather unlike VACUUM as we know it today, but that's the
on't quite see a clear path to making this patch useful.
But, as I said, I have an open mind about what the next step should
e we at least
know that the deleting transaction committed there. That is, they
could be recently dead or dead, and it may be worth going to the extra
trouble of checking which when we know that it's one of the two
On Mon, Jun 18, 2018 at 10:21 AM, Alvaro Herrera
> One which includes at least half a working day in a different timezone.
> You asked mid-afternoon on a Friday in a timezone pretty far west.
It was 11 am PST.
I'll make a note about this. It won't happen again.
> become a new unreserved keyword.
> So, do we want this feature? If we do I'll go ahead and prepare a patch.
I know that it's definitely a feature that I want. Haven't thought
about the syntax, though.
collector stats masking a problem, the possibility that there are very
important queries that do use the index but are only run very
infrequently, and so on.
warning: ‘blkno’ may be used uninitialized in this function
cleanup_block(info, stats, blkno);
I think that they're both harmless, though.
On Mon, Jun 18, 2018 at 2:54 PM, Peter Geoghegan wrote:
> On Sun, Jun 17, 2018 at 9:39 PM, Andrey V. Lepikhov
>> Patch '0001-retail-indextuple-deletion' introduce new function
>> amtargetdelete() in access method interface. Patch
eature. Oracle has a similar feature.
On Mon, Jun 18, 2018 at 4:05 PM, Peter Geoghegan wrote:
> Finally, doing things this way would let you delete multiple
> duplicates in one shot, as I described in an earlier e-mail. Only a
> single descent of the tree is needed to delete quite a few index
> tuples, provided that the
On Mon, Jun 18, 2018 at 4:05 PM, Peter Geoghegan wrote:
> IOW, the approach you've taken in bttargetdelete() isn't quite correct
> because you imagine that it's okay to occasionally "lose" the index
> tuple that you originally found, and just move on. That needs to be
s issue is that gin indexes
> could return a correct result without an assertion failure even if it
> is somewhat corrupted. So maybe having amcheck for gin indexes would
> resolve part of the problem.
That seems like a really good idea. As you know, I have misgivings
about this area.
an also imagine someone
seeing something that I missed.
ue. I think we will need an option for that.
> I'm not yet sure if this option should be user-visible or
> auto-decision based on table access method used.
umns isn't appropriate. It seems sufficient to only
mention this once, in the CREATE INDEX docs.
Attached patch shows what I have in mind -- the total removal of this
From 293f1fc93771fa6be65867f53aa507fb30137230 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan
ection was not well considered in the first place, I
thought that the question was clear cut, and I doubted that anyone
else would follow up at all.
e -- not just the current session.
't seem like anything we can't fix with acceptable risk.
I agree with both points.
aybe that alone will be enough to make the
patch worth committing; "opportunistic microvacuuming" can come later,
if at all.
to the heap tuple via an index scan
if there are no index tuples that stick around until "recently dead"
heap tuples become "fully dead"? How can you avoid keeping around both
old and new index tuples at the same time?
On Tue, Jun 19, 2018 at 4:06 AM, Pavan Deolasee
> I would like to resubmit the MERGE patch for PG12. The past discussions
> about the patch can be found here.
FWIW, I'm really glad that you're going to work on this for v12.
the only one that doesn't use context diffs and also doesn't hate
them with a passion? I don't get it.
> There's already a (perhaps underdocumented) way to make regression
> diffs look the way you want in local runs.
Actually, it is documented.
* picksplit loop.
Perhaps the problem is in the range type SP-GiST opclass support code
- it could have this exact picksplit infinite loop problem. That's
perfectly consistent with what we see here.
On Sun, May 27, 2018 at 2:09 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Sun, May 27, 2018 at 12:45 PM, Jonathan S. Katz <jk...@postgresql.org>
>> Next, see bad.sql. 1.2MM sparsely clustered rows inserted, GiST indexes
>> builds in about 30s on my ma
Congratulations to the other 6 new committers.
(Sent from my phone)
s of enforcement, etc.
Naturally, the rules across disparate groups vary widely for all kinds
of reasons. Formalizing and being more transparent about how this
works seems like the opposite of paternalism to me.
l screwed in this situation. It does finish eventually, but in about
> 50x longer than GiST. I imagine the index's query performance would be
> equally awful.
Can you think of some way of side-stepping the issue? It's unfortunate
that SP-GiST is potentially so sensitive to input order.
On Sun, May 27, 2018 at 5:24 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Sun, May 27, 2018 at 5:10 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Instrumenting the test case suggests that getQuadrant pretty much always
>> returns 1, resulting in a worst-case unbal
rting about 500 million records. Periodic pstack
> runs on Linux showed that the backend was busy in btree operations. I didn't
> pursue the cause due to other businesses, but there might be something to be
What kind of data was indexed? Was it a bigserial primary key, or
ral, page deletion is the most complicated part of nbtree
concurrency, by far (if we just had the basic L&Y, the concurrency
aspects would be far easier to grasp). Doing better in
_bt_unlink_halfdead_page() seems extremely difficult, and very
unlikely to be worthwhile.
You're probably looking at a test-case that doesn't involve a
multi-level deletion, where the leaf and target page are the same. Not
sure if you want to reproduce the multi-level deletion case
On Wed, Jun 27, 2018 at 1:11 PM, Andres Freund wrote:
> Well, I don't really want to generally do better. Just be able to check
> for interrupts ;)
That's what I meant. :-)
unsure of that, though.
between _bt_delitems_delete() and
_bt_delitems_vacuum(), and the considerations for hot standby? I think
that that's another TODO list item for this patch.
On Fri, Jun 22, 2018 at 12:43 PM, Peter Geoghegan wrote:
> On Fri, Jun 22, 2018 at 4:24 AM, Andrey V. Lepikhov
>> According to your feedback, I developed the second version of the patch.
>> In this version:
>> 1. High-level functions index_beginscan(), index_rescan() n
at my patch
has are also problems for your patch. One patch cannot get committed
without the other, so they are already the same project. As a bonus,
my patch will probably improve the best case performance for your
patch, since multi-deletions will now have much better locality of
SERIAL-like inputs, especially when parallelism is used. Back in 1999,
there was no WAL-logging.
From 2a77b947e07dadef953019334062580cad560836 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan
Date: Sun, 24 Jun 2018 11:41:16 -0700
Subject: [PATCH] Remove obsolete comment
s of that technology.
I don't understand, but okay. I can provide feedback once a design for
delete marking is available.
On Thu, Jun 14, 2018 at 11:44 AM, Peter Geoghegan wrote:
> I attach an unfinished prototype of suffix truncation, that also
> sometimes *adds* a new attribute in pivot tuples. It adds an extra
> heap TID from the leaf level when truncating away non-distinguishing
> attributes during
On Mon, Jul 2, 2018 at 9:28 AM, Peter Geoghegan wrote:
>> Execution time of last "VACUUM test;" command on my notebook was:
>> with bulk deletion: 1.6 s;
>> with Quick Vacuum Strategy: 5.2 s;
>> with Quick Vacuum Strategy & TID sorting: 0.6 s.
ng any index at all, which Tom referred
to in passing. That's why I understand his remarks in this way.
assing the lowest heap TID in the list of TIDs to kill to
_bt_search() and _bt_binsrch(). You might have to work through several
extra B-Tree leaf pages per bttargetdelete() call without this (you'll
move right multiple times within bttargetdelete()).
This isn't a correctness issue, and the extra overhead of unneeded
truncation should be negligible, but what we have now seems confusing
> that using wall-clock time is the right way to solve that problem
> because it leads to edge effects
I agree that wall-clock time is a bad approach, actually. If we were
to use wall-clock time, we'd only do so because it can be used to
discriminate against a thing that we actually care about in an
approximate, indirect way. If there is a more direct way of
identifying correlated accesses, which I gather that there is, then we
should probably use it.
32 kB of memory left after that
>> (including, if it went below 0), then we bump availMem back up to 32 kB. So
>> we'd always reserve 32 kB to hold the tuples, even if that means that we
>> exceed 'work_mem' slightly.
> Sounds very reasonable.
e minimum additional
amount of memory should be big enough that you have some chance of
using all 1024 SortTuples (the whole memtuples array).
maybe a new preprocessor
state->memtupsize = Max(1024, ALLOCSET_SEPARATE_THRESHOLD /
sizeof(SortTuple) + 1);
The ALLOCSET_SEPARATE_THRESHOLD part can become a static assertion.
oint that the sort starts,
rting extremely wide tuples.
-1 from me. What about the case where only some tuples are massive?
or index pages and visibility map
> pages ought to show up as fiery hot on workloads where the index or
> visibility map, as the case may be, is heavily used.
I don't think that internal index pages and the visibility map are the
problem, since there are so few of them, and they are accessed at least
hundreds of times more frequently than the leaf pages.
lem is it that we waste memory bandwidth copying to
and from OS cache, particularly on large systems? Might that be the
bigger problem, but also one that can be addressed incrementally?
I think that I may be repeating some of what Andrey said, in another
way (not sure of that).
pg_stat_bgwriter.buffers_backend_fsync in the wild after the
compaction queue stuff was added/backpatched?
that generalizes from that won't have a
problem, but it would be nice if it had a friendlier message. I didn't
change amcheck to account for this myself because I thought it was
possible that somebody else would want to use a more general solution.
> or BufFileRead() on the shared BufFile handle before calling
> BufFileAppend(), it would get confused.
Seems worth fixing.
! NOTICE: drop cascades to 17 other objects
\set VERBOSITY terse
DROP SCHEMA collate_tests CASCADE;
! NOTICE: drop cascades to 20 other objects
On Thu, May 3, 2018 at 3:18 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Peter Geoghegan <p...@bowt.ie> writes:
>> Why should the drop cascade to 63 objects rather than 62 because I've
>> set ignore_system_indexes=on?
> Indeed, that seems weird. Maybe tweak the
On Thu, May 3, 2018 at 4:14 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Thu, May 3, 2018 at 3:18 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Indeed, that seems weird. Maybe tweak the test scripts so you can see
>> all the objects cascaded to, and then find out wh
esn't justify a
> separate usage count bump each time.
I don't have time to check this out just now, but it seems like an
excellent idea. It would be nice if it could be enhanced further, so
you get some idea of what the blocks are without having to decode them
yourself using tools like pageinspect.
> will be reported as deletion targets. And it's not surprising that
> disabling indexscans on pg_depend changes the visitation order.
I also noticed that there are cases where we see less helpful (though
still technically correct) HINT messages about which other object the
user may prefer to drop.
e of a problem with
miscounting the number of hits based on accidents around how things
are structured in the executor -- we need to be more careful about
recognizing whether or not block hits are logically distinct.
Particularly with stuff like nested loop joins.
er on a view. I'm too tired to figure out whether or
not this is truly cause for concern right now, though
ler, the costs associated with index descent become higher.
FWIW, I think that int4 indexes have to be extremely large before they
exceed 4 levels (where the leaf level counts as a level). My
conservative back-of-an-envelope estimate is 9 billion index tuples.
est and most important
parts of successfully contributing to PostgreSQL. It's usually
essential to understand in detail why the thing that you're thinking
of working on doesn't already exist. The TODO list seems to suggest
almost the opposite, and as such is a trap for inexperienced hackers.
it to a high standard then it probably would have happened already.
The fact that it hasn't happened tells us plenty.
> join done one partition at a time, but a leader-wise logical tape set
> is not done one leader at a time. If there's another meaning to the
> affix -wise, I'm not familiar with it. Don't we just mean "a single
> logical tapeset managed by the leader"?
Yes, we do. Will change.
> There's a lot here I haven't grokked yet, but I'm running out of
> mental energy so I think I'll send this for now and work on this some
> more when time permits, hopefully tomorrow.
The good news is that the things that you took issue with were about
what I expected you to take issue with. You seem to be getting through
the review of this patch very efficiently.
On Wed, Jan 10, 2018 at 1:31 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> Can we actually call it max_parallel_maintenance_workers instead?
>> I mean we don't have work_mem_maintenance.
> Good point.
is the main reason why external
sorts can be faster than internal sorts -- this happens fairly
frequently these days, especially with CREATE INDEX, where being able
to write out the index as it merges on-the-fly helps a lot.)
comes from the
idea of having a leader-wise space that results from concatenating worker
tapes (which each have their own original, worker-wise space). We must apply an offset
to get from a worker-wise offset to a leader-wise offset.
This made more sense in an earlier version. I overlooked this during
recent self review.
rial case too, but much less so.
That's really not why he did it.
> I assume the author credit will be "Peter
> Geoghegan, Rushabh Lathia" in that order, but let me know if anyone
> thinks that isn't the right idea.
"Peter Geoghegan, Rushabh Lathia" seems right. Tho
> knows where all the variables are. In my experience, inlining hurts
> both of those things, which is why I'm saying that forcing inlining
> even in non-optimized builds is a bad idea.
Isn't that an argument against inlining in general, rather than
forcing inlining in particular?
an for waiting to hear other people's views
before proceeding. It really needs to be properly discussed.
the most convenient
way of using L&S's technique to defer recycling until it is definitely
safe. We only need to make sure that _bt_page_recyclable() cannot
become confused by XID wraparound to fix this problem -- that's it.
helpful unless used thoughtfully.
detecting problems mechanically. Rebasing a patch without conflicts
(including seeing a warning about offsets) does not mean that your
patch didn't become broken in some subtle, harmful way. Mechanical
detection is only useful to the extent that it guides and augments
atch, and check its
progress, to make sure that what I add is comparable to what
ultimately gets committed for parallel query.
perhaps only secondarily
about the question of ripping out parallel_leader_participation
> Can you please elaborate what part of optimizer are you talking about
> where without leader participation partial path will always lose to a
> serial sequential scan path?
See my rema
e" comment within
InitializeParallelDSM() should instead say something like "Serialize
Other than that, looks good to me. It's a simple patch with a clear purpose.
On Thu, Jan 18, 2018 at 10:27 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Jan 18, 2018 at 1:14 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> That seems pretty far fetched.
> I don't think it is, and there are plenty of other examples. All you
allow a degenerate parallel CREATE INDEX in the next version. I think
that it will make parallel_leader_participation less useful, with no
upside, but there doesn't seem to be much more that I can do about
On Sun, Jan 14, 2018 at 8:25 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sun, Jan 14, 2018 at 1:43 AM, Peter Geoghegan <p...@bowt.ie> wrote:
>> On Sat, Jan 13, 2018 at 4:32 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>> Yeah, but this would mean
t there is this issue with
nworkers_launched until very recently. But then, that field has
absolutely no comments.
ry and on slave.
> Also, I've checked bt_index_parent_check() on master.
I fixed that recently. It should be fine now.