But surely it should be possible to use DO NOTHING without inferring some
particular unique index? That's possible with an approach based on
inheritance.
--
Peter Geoghegan
(Sent from my phone)
DEX finishes or similar looks particularly
unappealing when one considers that the app was probably affected by
any corruption for weeks or months before the ICU update enabled its
detection.
[1] http://userguide.icu-project.org/collation/api#TOC-Sort-Key-Features
--
Peter Geoghegan
or ICU, but doesn't hurt. */
I see that you have preserved strcoll() comparison caching (or should
I say ucol_strcollUTF8() comparison caching?), at the cost of having
to keep around a buffer which we must continue to copy every text
string into within varstr_abbrev_convert(). That was probably the
r
want to
get that out of the way now, since it needed to be fixed up by hand to
look reasonable. typedef list also updated.
--
Peter Geoghegan
0001-Add-amcheck-extension-to-contrib.patch.gz
Description: GNU Zip compressed data
design of some
parts of contrib/isn is just horrible.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 9, 2017 at 2:47 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> which isn't an issue here, but reinforces my point about the (badly
>> documented) assumption that we don't release locks on user relations
>> early.
>
> You are right about the substantive issu
ert here, an issue like that
would presumably also be okay for parallel CREATE INDEX. It then
follows that what I'm missing here is something that is only really
needed for the parallel hash join patch anyway.
I really want to help Thomas, and am not shirking what I feel is a
responsibility to assi
d
practice/coding standards when they're very clearly not consistently
adhered to at all. You should at least say "let's not make a bad
situation any worse", or something, so that I don't need to spend 10
minutes pulling my hair out.
--
Peter Geoghegan
"for
> index ..." bits and use an errcontext for that instead, which'd be
> included in all messages.
Doesn't seem worth it to me.
>> + /*
>> + * General notes on concurrent page splits and page deletion:
>> + *
>> + * Routines like _bt_search() don't require *any* pag
standard idiom, I
think.
> If there is a problem with just requesting 8 bytes, then I'm wondering
> how this would affect the ICU code branch.
This must be fine with ICU's ucol_nextSortKeyPart(), because it is
designed for the express purpose of producing only a few bytes of the
final bl
incur
random I/O, often completely random I/O, but by and large it would be
a matter of swallowing that cost sooner, through using your tool,
rather than later, during the execution of queries.
--
Peter Geoghegan
That's the thing that
increases at logarithmic intervals as there is a linear increase in
the number of workers assigned to the operation (so it's not the size
of the underlying table).
--
Peter Geoghegan
On Mon, Jan 30, 2017 at 9:15 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> IIUC worker_wait() is only being used to keep the worker around so its
>> files aren't deleted. Once buffile cleanup is changed to be
>> ref-counted (in an on_dsm_detach hook?) then workers migh
e need to make an automated checker tool a requirement
for very complicated development projects in the future. We're behind
here.
--
Peter Geoghegan
hes, etc. Those are pretty clear
> bugs, and are reported by users.
I meant that I find the fact that there were no user reports in all
these years to be a good reason to not proceed for now in this
instance.
I wrote amcheck to detect the latter variety of bug, so clearly I
think that they are very s
wrong, the
> effects of our mistake could easily be a lot more serious than the
> original bug.
+1. The fact that it wasn't driven by a user report convinces me that
this is the way to go.
--
Peter Geoghegan
grow
linearithmically, whereas writing out runs has costs that grow
linearly. The relative cost of the I/O can be expected to go down as
input goes up for this reason. At the same time, a larger input might
make better use of I/O parallelism, which reduces the cost paid in
latency to write out runs
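The scaling argument above can be illustrated with a rough model. This is only a sketch under stated assumptions: one cost unit per comparison, one cost unit per tuple written, and about n * log2(n) comparisons per sort. The function names are hypothetical and are not taken from tuplesort.c.

```c
/* Smallest k such that 2^k >= n (valid for n up to 2^63). */
static unsigned
ceil_log2(unsigned long long n)
{
    unsigned bits = 0;

    while ((1ULL << bits) < n)
        bits++;
    return bits;
}

/* Modeled comparison cost: grows linearithmically, n * log2(n). */
static double
comparison_cost(unsigned long long ntuples)
{
    return (double) ntuples * ceil_log2(ntuples);
}

/* Modeled cost of writing out runs: grows linearly, one unit per tuple. */
static double
run_write_cost(unsigned long long ntuples)
{
    return (double) ntuples;
}

/* Share of the total modeled cost that is spent writing out runs. */
static double
io_share(unsigned long long ntuples)
{
    double io = run_write_cost(ntuples);

    return io / (io + comparison_cost(ntuples));
}
```

Under this model, `io_share()` shrinks as the input grows, which is the sense in which the relative cost of the I/O goes down. The model ignores latency and I/O parallelism entirely.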
was the final nail in the coffin of
replacement selection. I certainly don't want to relitigate the
discussion on replacement_sort_tuples, and am not going to push too
hard, but ISTM that we should fully remove replacement selection from
tuplesort.c and be done with it.
--
Peter Geoghegan
no longer work at Heroku, and so can do very little about it now,
but I found out privately that disabling parallel query made the
problem go away. It hasn't returned as of today. Parallel query was
disabled this whole time.
--
Peter Geoghegan
ea of exploiting presortedness, and
to some extent the original algorithm does that (by using insertion
sort), but an optimization along the lines of Timsort's "galloping
mode" (which is what this modification of ours attempts) requires
non-trivial bookkeeping to do right.
--
Peter Geogh
t revision?
That is the plan. I need to get set up with a new machine here, having
given back my work laptop to Heroku, but it shouldn't take too long.
Thanks for the review.
--
Peter Geoghegan
On Mon, Jan 30, 2017 at 8:46 PM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Wed, Jan 4, 2017 at 12:53 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> Attached is V7 of the patch.
>
> I am doing some testing. First, some superficial things from first pass:
On Mon, Jan 30, 2017 at 10:01 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> Let me take another look at this later today before proceeding. I want
> to run it against a custom test suite I've developed.
I've done so. Some more thoughts:
* I don't think that this is really any less ef
fine, then.
Let me take another look at this later today before proceeding. I want
to run it against a custom test suite I've developed.
--
Peter Geoghegan
ent the end of a run for tuplesort's purposes. That makes
me doubtful that this is any more robust or general than what I
proposed. So, I don't have a problem with the performance implications
of doing this, which should be minor, but I'm concerned that it
appears to be more general than it actually i
was not an easy decision for me to leave Heroku, but I felt it was
time for a change. I am very grateful to have had the opportunity. I
have learned an awful lot during my time at the company. It has been
excellent to have an employer that has been so supportive of my work
on Postgres this whole ti
On Wed, Jan 25, 2017 at 1:22 PM, Peter Geoghegan <p...@heroku.com> wrote:
> I understand that my experience with storage devices is unusually
> narrow compared to everyone else here. That's why I remain neutral on
> the high level question of whether or not we ought to e
keep users from
> losing data.
Wouldn't that have issues with torn pages?
--
Peter Geoghegan
On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Please. You might want to hit the existing ones with a separate patch,
> but it doesn't much matter; I'd be just as happy with a patch that did
> both things.
Got it.
--
Peter Geoghegan
instead.
> There are several other uses of "call here", both in this patch and
> pre-existing in tuplesort.c, that I find equally vague and unsatisfactory.
> Let's try to improve that.
Should I write a patch along those lines?
--
Peter Geoghegan
t of customer databases. Not even once.
This is not what I would have expected myself several years ago.
--
Peter Geoghegan
eed to invest in corruption detection/verification tools that are
run on an as-needed basis. They are available to users of every other
major database system.
--
Peter Geoghegan
ositive occurred, that might
> help alert you to underlying storage problems, but it isn't helping you
> with respect to being able to access your perfectly valid data.
It was a terminology problem. Thank you for the clarification.
--
Peter Geoghegan
ch a bug in recovery itself, even
when the filesystem maintains the guarantees Postgres requires.
In any case, it seems exceedingly unlikely that the checksum code
itself would fail.
--
Peter Geoghegan
below the filesystem. We
clearly have not done that, so ISTM that checksums could legitimately
find bugs in the checksum code. I am not being facetious.
--
Peter Geoghegan
ble downside. I'd like to see a benchmark.
[1] http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf
--
Peter Geoghegan
, an address from the userland program (Postgres)
address space, so GDB does in fact display interesting disassembly --
the documentation on this seems extremely limited.)
Note that there is a text array column in the table "albums". I note
that there have been fixes in this area already, that a
issue, or even the main issue, but I'm fairly
suspicious of the fact that cost_sort() doesn't distinguish between
the comparison cost of text and int4, for example.
--
Peter Geoghegan
is
worthwhile. I just think that you may have somewhat underestimated how
much this could help when we're low on memory, but not ridiculously
so. It doesn't seem necessary to prove this, though. (We need only
verify that there are no regressions.)
That's all I have right now.
[1] https://commitfest.postg
here were concerns about the overhead of an external sort
test on slower buildfarm animals.)
--
Peter Geoghegan
lack a test-for-NULL-argument needed
by pass-by-value datum cases. The other two RELEASE_SLAB_SLOT() calls
already have such a check.
Attached patch fixes the bug.
--
Peter Geoghegan
From ce24bff1aad894b607ee1ce67757efe72c5acb93 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
a way to
combine INSERT, UPDATE, and DELETE into one convenient DML statement".
MERGE is most compelling when performing bulk loading. That being the
case, in my mind MERGE remains something that we really haven't turned
our back on at all.
--
Peter Geoghegan
gresql.org/wiki/UPSERT#MERGE_disadvantages
[2]
https://www.postgresql.org/message-id/CAM3SWZRP0c3g6+aJ=yydgyactzg0xa8-1_fcvo5xm7hrel3...@mail.gmail.com
--
Peter Geoghegan
as needed (see how I generate BufFileOps for an idea of what I
mean if it's not immediately clear). That's also an easy change, or at
least will be once the refcount thing is added.
--
Peter Geoghegan
d a lot of time for
> PostgreSQL work.
>
> For that reason, as of today, I am stepping down from the PostgreSQL
> Core Team.
Thank you Josh!
--
Peter Geoghegan
On Wed, Jan 11, 2017 at 12:05 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jan 11, 2017 at 2:20 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> You'd probably still want to throw an error when workers ended up not
>> deleting BufFile segments they owned, th
On Wed, Jan 11, 2017 at 11:20 AM, Peter Geoghegan <p...@heroku.com> wrote:
>> If multiple processes are using the same file via the BufFile
>> interface, I think that it is absolutely necessary that there should
>> be a provision to track the "attach count"
On Wed, Jan 11, 2017 at 10:57 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Jan 10, 2017 at 8:56 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> Instead of all this, I suggest copying some of my changes to fd.c, so
>> that resource ownership within fd.c d
+++ b/src/backend/executor/nodeSeqscan.c
> @@ -31,6 +31,8 @@
> #include "executor/nodeSeqscan.h"
> #include "utils/rel.h"
>
> +#include
> +
> static void InitScanRelation(SeqScanState *node, EState *estate, int eflags);
> static TupleTableSlot *SeqNext(SeqScanStat
S strxfrm() shouldn't be a problem with ICU).
Otherwise, it could do some kind of testing on our pg_strxfrm()
wrapper (or similar).
--
Peter Geoghegan
On Mon, Jan 9, 2017 at 12:25 PM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 1/7/17 10:01 PM, Peter Geoghegan wrote:
>> It occurs to me that the comparison caching stuff added by commit
>> 0e57b4d8b needs to be considered here, too.
> Tha
ambiguity
that that paper goes into. C may have contradictory goals, but that
doesn't mean they're the wrong goals, even when considered as a whole.
The culture that C is steeped in still makes a lot of sense for a
system like Postgres.
--
Peter Geoghegan
tself at one point, built right into
strcoll(), but it was subsequently disabled.)
--
Peter Geoghegan
be very unlucky to have things line up in exactly the wrong way
with 1GiB BufFile segments, but it can still happen.)
--
Peter Geoghegan
From d4a611e94a3b4504f034fdb31a8ab4a72955b887 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Thu, 5 Jan 2017 11:29:53 -0800
S
op-N heapsorts here). That has perhaps
unbeatable efficiency, while also helping cases with significant
physical/logical correlation in their input, which is pretty common.
Creating an index on a serial PK within pg_restore would probably get
notably faster if we went this way.
--
Peter Geoghega
On Tue, Dec 20, 2016 at 5:14 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> Imagine a data structure that is stored in dynamic shared memory and
>> contains space for a filename, a reference count, and a mutex. Let's
>> call this thing a SharedTemporaryFile or something
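The structure described in the quoted text (a filename, a reference count, and a mutex) might look roughly like the sketch below. Everything here is hypothetical: the field and function names are invented for illustration, and a C11 `atomic_flag` spinlock stands in for whatever lock primitive a real patch would use in dynamic shared memory.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define SHARED_TEMP_NAME_MAX 64

/* Hypothetical layout: lock, reference count, and file name. */
typedef struct SharedTemporaryFile
{
    atomic_flag mutex;          /* stand-in spinlock protecting refcount */
    int         refcount;       /* number of attached processes */
    char        filename[SHARED_TEMP_NAME_MAX];
} SharedTemporaryFile;

static void
stf_lock(SharedTemporaryFile *file)
{
    while (atomic_flag_test_and_set(&file->mutex))
        ;                       /* spin until the flag is released */
}

static void
stf_unlock(SharedTemporaryFile *file)
{
    atomic_flag_clear(&file->mutex);
}

/* The creator starts out holding the first reference. */
static void
stf_init(SharedTemporaryFile *file, const char *name)
{
    atomic_flag_clear(&file->mutex);
    file->refcount = 1;
    strncpy(file->filename, name, SHARED_TEMP_NAME_MAX - 1);
    file->filename[SHARED_TEMP_NAME_MAX - 1] = '\0';
}

/* Another process starts using the file. */
static void
stf_attach(SharedTemporaryFile *file)
{
    stf_lock(file);
    file->refcount++;
    stf_unlock(file);
}

/* Returns true if the caller held the last reference and should delete the file. */
static bool
stf_detach(SharedTemporaryFile *file)
{
    bool last;

    stf_lock(file);
    last = (--file->refcount == 0);
    stf_unlock(file);
    return last;
}
```

The point of the design is that no process needs to outlive the others just to keep the file alive: whichever process detaches last performs the cleanup.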
artially loaded the hash table in ExecHashJoinPreloadNextBatch.
> +*/
> + Assert(hashtable->batch_reader.batchno = curbatch);
> + Assert(hashtable->batch_reader.inner);
Obviously this isn't supposed to be an assignment.
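To spell out why the assignment matters, here is a minimal standalone sketch. The `BatchReader` struct and both helper functions are hypothetical stand-ins, not code from the patch under review.

```c
/* Hypothetical stand-in for the reader state in the quoted hunk. */
typedef struct BatchReader
{
    int batchno;
} BatchReader;

/*
 * Buggy form: "=" assigns, so the check passes whenever curbatch is
 * nonzero, and batchno is silently overwritten as a side effect.
 */
static int
check_with_assignment(BatchReader *reader, int curbatch)
{
    return (reader->batchno = curbatch) != 0;
}

/* Intended form: "==" compares and leaves the reader state untouched. */
static int
check_with_comparison(const BatchReader *reader, int curbatch)
{
    return reader->batchno == curbatch;
}
```

With `batchno` set to 1 and `curbatch` set to 2, the comparison form reports the mismatch. The assignment form reports success and changes `batchno` to 2, so an assertion written that way hides the very condition it was meant to catch. In builds with assertions compiled out, the assignment never runs, so behavior could also differ between build types.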
--
Peter Geoghegan
ely on the tape freezing code to flush the last block out to make
state available in temp files for the leader to process/merge. The
memory savings that remain on the table are probably not measurable if
we were to fix them, given the work we've already done, palloc()
fragmentation, and so on.
--
On Wed, Dec 21, 2016 at 10:21 AM, Peter Geoghegan <p...@heroku.com> wrote:
> On Wed, Dec 21, 2016 at 6:00 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> 3. Just live with the waste of space.
>
> I am loath to create a special case for the parallel interface too,
to use so little memory in the first place; this is a
corner case.
[1]
https://www.postgresql.org/message-id/CAM3SWZR+ATYAzyMT+hm-Bo=1l1smtjbndtibwbtktyqs0dy...@mail.gmail.com
--
Peter Geoghegan
as
proposed. I didn't go that far in part because it seemed premature,
given that nobody had looked at my work to date at the time, and given
the fact that there'd be no initial user-visible benefit, and given
how the exact meaning of "unification" was (and is) somewhat in flux.
I see no good
dea seems uncontroversial.
--
Peter Geoghegan
sting of 0001-* alone. Oops.
Attached revision of 0001-* fixes this. A revised 0002-* is also
attached, just as a convenience for reviewers (they won't need to
resolve the conflict themselves).
--
Peter Geoghegan
From 541a7f0ae6060763cd4448359159c1a2c5980a68 Mon Sep 17 00:00:00 2001
From: Peter Geogheg
t=3405052.45..3676948.66 rows=320 width=32)
(actual time=21165.849..37814.551 rows=1357812 loops=4)
Is this the best test case to show off the patch? This node is the
immediate outer child of a Nested Loop Semi Join, and so I'm concerned
that we are measuring the wrong thing.
--
Peter Geoghegan
On Fri, Oct 21, 2016 at 4:45 PM, Peter Geoghegan <p...@heroku.com> wrote:
> More importantly, there are no remaining cases where
> tuplesort_gettuple_common() sets "*should_free = true", because there
> is no remaining need for caller to *ever* pfree() tuple. Moreover,
nd any breakage would be
> straightforward to fix.
I don't think so. I'm not aware of any third party extensions that
call tuplesort.c routines, despite having looked for them. I noticed
that pg_repack doesn't. For any that do, they'll break in a
predictable, obvious way.
--
Peter Geoghegan
te motivating example would be nice. For example, it
would be nice to see the overall speedup for some particular TPC-H
query.
--
Peter Geoghegan
on how
parallel CREATE INDEX should handle the ecosystem of tools like
pg_restore, reindexdb, and so on. Personally, I'm neutral on which
general approach should be taken. Proposals from other hackers about
what to do here are particularly welcome.
--
Peter Geoghegan
it might be invoked).
In general, I have a positive outlook on this patch, since it appears
to compete well with similar implementations in other systems
scalability-wise. It does what it's supposed to do.
--
Peter Geoghegan
On Mon, Nov 7, 2016 at 8:28 PM, Peter Geoghegan <p...@heroku.com> wrote:
> What do we need to teach pg_restore about parallel CREATE INDEX, if
> anything at all? Could this be as simple as a blanket disabling of
> parallelism for CREATE INDEX from pg_restore? Or, does it ne
ve any comments on the patch,
> please move the patch into "ready for committer" state to get committer's
> attention. This will help us in smoother operation of commitfest.
Sorry for the delay on this.
I agree with Robert's remarks today on TupleTableSlot, and would like
to see a revi
ight incorporate it into my own testing in the
future.
--
Peter Geoghegan
d may be better. OTOH, under the mostly-INSERT
> workload (like data warehouse?), the current method will be better because it
> writes no log for UNDO.
I believe that you are correct about that.
--
Peter Geoghegan
On Tue, Nov 22, 2016 at 8:45 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Peter Geoghegan <p...@heroku.com> writes:
>> The best thing by far about an alternative design like this is that it
>> performs *consistently*.
>
> Really? I think it just moves the issues
it exceptionally
difficult.
--
Peter Geoghegan
S paper says
nothing about MVCC.
--
Peter Geoghegan
much worse for modern hardware.
I imagine that temporal locality helps a lot. Most snapshots will be
interested in the same row version.
--
Peter Geoghegan
ould be good, too!
I was told that TED was kind of formally proposed in the "MultiXact
hindsight design" thread.
--
Peter Geoghegan
eaf page directly (particularly for non-HOT UPDATEs). That's
what I'm mostly interested in investigating, here.
--
Peter Geoghegan
at, all feedback items were worked through. I made the
functions PARALLEL SAFE, too, since I noticed that that wasn't the
case in passing.
--
Peter Geoghegan
From 21c843ae0193ed17b2e5234d67f2e73f7015b3cd Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Tue, 10 Jun 2014 22:
be we can
lose that at some point.
You should use amcheck to specifically verify that that happens
reliably in all cases. Presumably, its use of an insertion scankey
would automatically see the use of TID as a tie-breaker with patched
Postgres amcheck verification, and so amcheck will work for this
purpose unmodified.
--
Peter Geoghegan
e an elevel that is != ERROR (the thing I
mention about elevel < ERROR is already documented in code comments).
If that breaks, they get to keep both halves.
--
Peter Geoghegan
of I/O during the scan against
how much we might expect to save in a subsequent bitmap heap scan, and
so on. This might be based on a selectivity estimate.
That's all fairly hand-wavy, certainly, but I see significant
potential along these lines.
--
Peter Geoghegan
On Thu, Aug 18, 2016 at 2:15 PM, Peter Geoghegan <p...@heroku.com> wrote:
> I think that this is a bad idea. We need to implement suffix
> truncation of internal page index tuples at some point, to make them
> contain less information from the original leaf page index tuple.
> T
On Thu, Nov 17, 2016 at 12:04 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> Hm, if we want that - and it doesn't seem like a bad idea - I think we
>> should make it available without recompiling.
>
> I suppose, provided it doesn't let CORRUPTION elevel be < ER
classic
L&Y "Ki < v <= Ki+1", to avoid bloat in the internal pages and to make
suffix truncation in internal pages work.
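The "Ki < v <= Ki+1" invariant can be stated as a small check. This is a sketch only: it assumes plain integer keys and a flat array of values, and `child_within_bounds` is a hypothetical helper, not a function from nbtree or amcheck.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if every value v in a child page satisfies
 * low_key < v <= high_key, where low_key plays the role of Ki and
 * high_key the role of Ki+1 in the parent.
 */
static bool
child_within_bounds(const int *values, size_t nvalues,
                    int low_key, int high_key)
{
    for (size_t i = 0; i < nvalues; i++)
    {
        if (!(low_key < values[i] && values[i] <= high_key))
            return false;
    }
    return true;
}
```

Note the asymmetry: a value equal to the lower bound fails the check, while a value equal to the upper bound passes.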
So, we don't have the cousin problem, but since I wish for us to adopt
the stricter classic L&Y invariant at some point, you could say that I
also wished that we had the
he parent level, we cannot merge the key space
of a page into its right sibling unless the right sibling is a child of
the same parent --- otherwise, the parent's key space assignment changes
too, meaning we'd have to make bounding-key updates in its parent, and
perhaps all the way up the tree. Since we can't possibly do that
atomically, we forbid this case. That means that the rightmost child of a
parent node can't be deleted unless it's the only remaining child, in which
case we will delete the parent too (see below).
''""
> I like this. Some of the more complex pieces towards the end of the
> field need some attention, there's a fair amount of word-smithing
> needed, and I do think we want to make the structural changes outlined
> above, but besides these, imo fairly simple adaptions, I do think this
> is useful and not that far from being committable.
Cool. I'll try to get a revision out soon. I'm happy to do that much.
--
Peter Geoghegan
be clear, I don't think that there is reason to tie it to adding the
PinBuffer() stuff, which we've been talking about for years now. It
just caught my eye.
--
Peter Geoghegan
On Mon, Nov 14, 2016 at 9:22 AM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> I think that difference in the API is exactly what caught Peter by surprise
> and led to bug #14344. And I didn't see it either, until you two debugged
> it.
That is accurate, of course.
--
n too far or we'll create slow polyphase
> merges in cases that are reasonably likely to occur in real life.
I completely agree with your analysis.
--
Peter Geoghegan
On Wed, Nov 9, 2016 at 4:54 PM, Peter Geoghegan <p...@heroku.com> wrote:
> It's more complicated than that. As I said, I think that Knuth
> basically had it right with his sweet spot of 7. I think that commit
> df700e6b40195d28dc764e0c694ac8cef90d4638 was effective in large part
ied my
best to balance here with a merge order of 500.
> Sound OK?
I'm fine with not mentioning Knuth's sweet spot once more. I guess
it's not of much practical value that he was on to something with
that. I realize, on reflection, that my understanding of what's going
on is very nuanced.
Thanks
--
Peter Geoghegan
ially asking the parallel infrastructure to not care. I think that
this works fine, given the limited scope of the problem, but it would
be nice to have that confirmed.
--
Peter Geoghegan
On Mon, Oct 24, 2016 at 6:17 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> * Cost model. Should probably attempt to guess final index size, and
>> derive calculation of number of workers from that. Also, I'm concerned
>> that I haven't given enough thought to the low
the key space in the
least restrictive way possible, by applying suffix truncation so that
it's much more likely that things will *stay* balanced as churn
occurs. This is probably a really bad problem with things like
composite indexes over text columns, or indexes with many NULL values.
--
Pete
"I don't see the point" seems to be
> ignoring the explanations already given.
+1. I strongly agree.
--
Peter Geoghegan
plied to the right kind of reloptions.
It seems worth adding an assertion, at least. I wonder what running
the regression tests with a bunch of similar assertions shows up...
--
Peter Geoghegan
On Wed, Oct 19, 2016 at 11:33 AM, Peter Geoghegan <p...@heroku.com> wrote:
> I don't think that eager merging will prove all that effective,
> however it's implemented. I see a very I/O bound system when parallel
> CREATE INDEX merges serially. There is no obvious reason w
On Wed, Aug 17, 2016 at 4:12 PM, Peter Geoghegan <p...@heroku.com> wrote:
> During preliminary analysis of what it would take to produce a
> parallel CLUSTER patch that is analogous to what I came up with for
> CREATE INDEX, which in general seems quite possibl
On Mon, Aug 1, 2016 at 3:18 PM, Peter Geoghegan <p...@heroku.com> wrote:
> Setup:
>
> CREATE TABLE parallel_sort_test AS
> SELECT hashint8(i) randint,
> md5(i::text) collate "C" padding1,
> md5(i::text || '2') collate "C" padding2