Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-04 Thread Peter Geoghegan
in a failure to find a split point, which there are many defenses against already, but I think I would find that difficult to prove. The intent of the code is almost as important as the code, at least in my opinion. [1] postgr.es/m/CAH2-Wz=vmdh8pfazx9wah9bn5ast5vrna0xsz+gsfrs12bp...@mail.gmail.com [2] postgr.es/m/11895.1490983884%40sss.pgh.pa.us -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-02 Thread Peter Geoghegan
On Thu, Mar 30, 2017 at 8:26 AM, Teodor Sigaev wrote: > Any objection from reviewers to push both patches? I object. Unfortunately, it seems very unlikely that we'll be able to get the patch into shape in the allotted time before feature-freeze, even with the 1 week extension. -

Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-02 Thread Peter Geoghegan
On Fri, Mar 31, 2017 at 4:31 PM, Peter Geoghegan wrote: > That's all I have for now. Maybe I can look again later, or tomorrow. I took another look, this time at code used during CREATE INDEX. More feedback: * I see no reason to expose _bt_pgaddtup() (to modify it to not be static, so i

Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Peter Geoghegan
Y(ropaque)); left_hikey = PageGetItem(rpage, hiItemId); left_hikeysz = ItemIdGetLength(hiItemId); } It seems like this was missed when you changed WAL-logging, since you do something for this on the logging side, but not here, on the replay side. No? That's all I have

Re: [HACKERS] Partitioning vs ON CONFLICT

2017-03-31 Thread Peter Geoghegan
d it up once, I'm inclined to think we should postpone > this to v11, think it over some more, a Fine by me. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Partitioning vs ON CONFLICT

2017-03-31 Thread Peter Geoghegan
t the presence or absence of unique indexes within or across partitions. It might be sloppy for an application developer to do a whole lot of this, but that's not a judgement I think we can make for them. I don't feel strongly about this, though. -- Peter Geoghegan -- Sent via

Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple

2017-03-31 Thread Peter Geoghegan
On Mar 31, 2017 2:17 PM, "Peter Geoghegan" wrote: On Fri, Mar 31, 2017 at 2:11 PM, Tom Lane wrote: >> The patch does actually store truncated/key-only tuples in the hi keys / >> non-leaf-nodes, which don't need the "included" columns. > > Hm. Since in

Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple

2017-03-31 Thread Peter Geoghegan
the same way as suffix truncation would. I see no conflict there. Quite the opposite, in fact. -- Peter Geoghegan (Sent from my phone)

Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple

2017-03-31 Thread Peter Geoghegan
nostic tools at all, despite being its own special case without real Datum values. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple

2017-03-31 Thread Peter Geoghegan
que, that Anastasia's patch happens to be a special case of. It's a technique that is almost as old as B-Trees themselves. The use of a dedicated bit probably wouldn't be necessary, but perhaps it makes things safer for the patch. -- Peter Geoghegan -- Sent via pgsql-ha

Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple

2017-03-31 Thread Peter Geoghegan
ve implemented a rough prototype of suffix truncation, that seems to work well enough, and keeps amcheck happy, and I base my remarks on the experience of writing that prototype. Using the NULL bitmap this way was the first thing I tried. -- Peter Geoghegan -- Sent via pgsql-hackers mailing lis

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-26 Thread Peter Geoghegan
f separating different classes of storage > (fast/expensive, slow/cheap etc), not as an IO or space striping > technique. I agree. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-26 Thread Peter Geoghegan
able, and disk capacity is the main concern. PHJ uses one temp tablespace per worker, which I further suppose might not be as effective in balancing disk space usage. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-25 Thread Peter Geoghegan
On Sat, Mar 25, 2017 at 7:56 PM, Thomas Munro wrote: > On Sun, Mar 26, 2017 at 1:53 PM, Peter Geoghegan wrote: >> ISTM that your patch now shares a quality with parallel tuplesort: You >> may now hold files open after an unlink() of the original link/path >> that they

Re: [HACKERS] Remove !isTemp buffile.c code

2017-03-25 Thread Peter Geoghegan
a good idea, but then I suggested it originally. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-25 Thread Peter Geoghegan
che LRU maintenance, which would fix this problem for parallel tuplesort, I suppose. That may not be workable for PHJ, because PHJ would probably need to hold on to such a "pin" for much longer, owing to the lack of any explicit "handover" phase. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-25 Thread Peter Geoghegan
). How will that interact with types like numeric, that have display scale or similar? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ICU integration

2017-03-24 Thread Peter Geoghegan
d for when Postgres installations are binary compatible. There should be a simple tool. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fetch JSONB Value for UNIQUE Constraint

2017-03-24 Thread Peter Geoghegan
On Fri, Mar 24, 2017 at 4:57 PM, David E. Wheeler wrote: > So it’s a fine workaround, but maybe there’s something missing from the > parsing of the CREATE TABLE statement? This is on 9.6.1. Unique constraints don't support expressions, or a predicate (partial-ness). -- Pete

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-22 Thread Peter Geoghegan
bout what is permissible and not permissible in this area, and do something with pgstat_report_tempfile(). This is a bit like the unlink()-ENOENT/-to-terminate (ENOENT ignore) issue. There are no really hard questions here, but there certainly are some awkward questions. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-22 Thread Peter Geoghegan
il. >> My patch demonstrably has these properties. I've done quite a bit of >> fault injection testing to prove it. > > I don't understand this comment, because 0 of the 3 properties that I > just articulated are things which can be proved or disproved by fault > inje

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-21 Thread Peter Geoghegan
On Tue, Mar 21, 2017 at 7:18 PM, Peter Geoghegan wrote: >> As shown in 0008-hj-shared-buf-file-v8.patch. Thoughts? > > A less serious issue I've also noticed is that you add palloc() calls, > implicitly using the current memory context, within buffile.c. > BufFileOpe

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-21 Thread Peter Geoghegan
ared-buf-file-v8.patch. Thoughts? A less serious issue I've also noticed is that you add palloc() calls, implicitly using the current memory context, within buffile.c. BufFileOpenTagged() has some, for example. However, there is a note that we don't need to save the memory context

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-21 Thread Peter Geoghegan
d necessarily so. I will still have some more feedback on your shared BufFile design, though, while it's fresh in my mind. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-21 Thread Peter Geoghegan
stored in dead B-Tree + SP-GiST pages that is used in the subsequent RecentGlobalXmin interlock that determines if recycling is safe (if there is no possible index scan that could land on the dead page). You know, the _bt_page_recyclable() check. -- Peter Geoghegan -- Sent via pgsql-hackers m

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-21 Thread Peter Geoghegan
BufFile without the risk that some other process > will try to unlink() the file while it's still in use. The point for > me isn't so much whether unlink() ever ignores errors as whether > cleanup (however defined) is an operation guaranteed to happen exactly > once. My pat

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-21 Thread Peter Geoghegan
's part of it, I guess, but it's more that the code you've added > to do parallelism here looks an awful lot like what's gotten added to > do parallelism in other cases, like parallel query. That's probably a > good sign. It's also a good sign that it makes CREA

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-21 Thread Peter Geoghegan
echanism will prove to be buggy. Following a quick look at the latest PHJ patch series, and its 0008-hj-shared-buf-file-v8.patch file, I already see one example. I notice that there could be multiple calls to pgstat_report_tempfile() within each backend for the same BufFile segment. Isn't that counting the same thing more than once? In general, it seems problematic that there is now "true" fd.c temp segments, as well as Shared BufFile temp segments that are never in a backend resource manager. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-21 Thread Peter Geoghegan
lf into the corner. Are we really saying that there can be no incompatible change to the on-disk representation for the rest of eternity? I can see why that's something to avoid indefinitely, but I wouldn't like to rule it out. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (p

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-20 Thread Peter Geoghegan
ow that the heap TIDs that are WARM root pointers are not going to be recycled in the lifetime of the amcheck query such that you get a false positive. A WARM check seems like a neat adjunct to what amcheck does already. It seems like a really good idea for WARM to buy into this kind of verificati

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-19 Thread Peter Geoghegan
On Sun, Mar 12, 2017 at 3:05 PM, Peter Geoghegan wrote: > I attach my V9 of the patch. I came up some stuff for the design of > resource management that I think meets every design goal that we have > for shared/unified BufFiles: Commit 2609e91fc broke the parallel CREATE INDEX cost

Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-03-19 Thread Peter Geoghegan
er varlenas, because tuplesort has detected that that happens to be generally safe. I doubt that I'll ever get around to posting a patch to do that, since the cost savings are probably still marginal. I could probably find something better to work on. -- Peter Geoghegan -- Sent via

Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-03-18 Thread Peter Geoghegan
On Sat, Mar 18, 2017 at 2:54 PM, Peter Geoghegan wrote: > This seems fine to me, especially > because it lets us compare macaddr using simple 3-way unsigned int > comparisons, which isn't otherwise the case. Out of idle curiosity, I decided to generate disassembly of both macadd

Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-03-18 Thread Peter Geoghegan
ation, or if it aborted, then the index_getattr() can happen once per SortTuple, up-front. Nitpick: the patch should probably not refer to 32-bit or 64-bit systems. Rather, it should refer to Datum size only. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Append implementation

2017-03-17 Thread Peter Geoghegan
ed list sort, but it seems we don't > have. Will do the qsort now since it would be faster. relcache.c does an insertion sort with a list of OIDs. See insert_ordered_oid(). -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to you

Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-03-16 Thread Peter Geoghegan
rd to make merging of pages that are not completely empty work, while also using the L&Y algorithm. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-14 Thread Peter Geoghegan
On Tue, Mar 14, 2017 at 3:10 PM, Peter Geoghegan wrote: > We already have BTPageOpaqueData.btpo, a union whose contained type > varies based on the page being dead. We could just do the same with > some other field in that struct, and then store epoch there. Clearly > nobody really

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-14 Thread Peter Geoghegan
On Tue, Mar 14, 2017 at 2:48 PM, Peter Geoghegan wrote: > I think that that's safe, but it is a little disappointing that it > does not allow us to skip work in the case that you really had in mind > when writing the patch. Better than nothing, though, and perhaps still > a go

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-14 Thread Peter Geoghegan
autovauum recycles that > page while index vacuuming(lazy_index_vacuum). Sorry for the delay in getting back to you. I think that that's safe, but it is a little disappointing that it does not allow us to skip work in the case that you really had in mind when writing the patch. Bette

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-14 Thread Peter Geoghegan
without changing the user-visible interface. It seems pretty complementary to what is already there. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread Peter Geoghegan
On Wed, Jan 25, 2017 at 3:11 PM, Peter Geoghegan wrote: > On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane wrote: >> Please. You might want to hit the existing ones with a separate patch, >> but it doesn't much matter; I'd be just as happy with a patch that did >> both t

Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread Peter Geoghegan
t, because ltsReadBlock() could be involved instead. I don't remember the exact details offhand, so I will have to look into it again. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread Peter Geoghegan
explanation > is is posted by 2017-03-16 AoE I will mark this submission > "Returned with Feedback". Apologies for the delay on this. I intend to get back to it before that time. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To mak

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-12 Thread Peter Geoghegan
On Sun, Mar 12, 2017 at 3:05 PM, Peter Geoghegan wrote: > There is still an open item here, though: The leader-as-worker > Tuplesortstate, a special case, can still leak files. I phrased this badly. What I mean is that there can be instances where temp files are left on disk following a f

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-12 Thread Peter Geoghegan
On Thu, Feb 16, 2017 at 8:45 AM, Peter Geoghegan wrote: >> I do not think there should be any reason why we can't get the >> resource accounting exactly correct here. If a single backend manages >> to remove every temporary file that it creates exactly once (and >>

Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-09 Thread Peter Geoghegan
ay to expose the hash value in pg_stat_activity like that is above my pay grade, as Tom would say. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 7:12 PM, Peter Geoghegan wrote: >> Hm - I think it's fair to export RecentGlobalXmin, given that we >> obviously use it across modules in core code. I see very little reason >> not to export it. > > Well, the assertion is completely useless

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
to export it. Well, the assertion is completely useless as anything but documentation... -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
ildenv\HEAD\pgsql.build\amcheck.vcxproj] Rather than marking RecentGlobalXmin as PGDLLIMPORT, I'd rather just remove the documenting assertion and leave that comment as-is. I'll work on a patch for this soon. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql

Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Peter Geoghegan
owner clean-up of BufFiles. Otherwise, somebody might get in big trouble if they called BufFileClose() or something in an error path. Arguably, the reliance on ordering already exists today. I'm not saying that that's a good plan, or even an acceptable trade-off. Pick your poison. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 4:29 PM, Thomas Munro wrote: > On Fri, Mar 10, 2017 at 8:19 AM, Peter Geoghegan wrote: >> by having state for each segment, it ends up actually *relying on* >> ENOENT-on-unlink() as a condition that terminates execution: > > Yeah, this seems to fall o

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-09 Thread Peter Geoghegan
mas Munro, Anastasia Lubennikova, Robert Haas, Amit Langote" Thanks for your help with this! -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-09 Thread Peter Geoghegan
't until some contrary evidence emerges. > > I mean, sometimes it is clear that you are going to need special > handling someplace, and then you have to do it. But I don't see that > this is one of those cases, necessarily. That's what I'll do, then. -- Peter Geog

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-09 Thread Peter Geoghegan
ent" thread. They still apply to this latest version of the patch series. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Peter Geoghegan
On Thu, Mar 9, 2017 at 11:19 AM, Peter Geoghegan wrote: > That patch seems to be solving the problem by completely taking over > management of temp files from fd.c. That is, these temp files are not > marked as temp files in the way ordinary temp BufFiles are, with > explicit

Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-09 Thread Peter Geoghegan
ly advantage I immediately see with the approach 0007-hj-shared-buf-file-v6.patch takes (over what I've provisionally written as my V9) is that by putting everything in shared memory all along, there is no weirdness with tying local memory clean-up to a shared memory on_dsm_detach() callback. As

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-08 Thread Peter Geoghegan
testing is in. There are a number of options, none of which are difficult to write code for. The hard part is determining what makes most sense for users on balance. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-07 Thread Peter Geoghegan
in the course of anti-wraparound VACUUM, even if VACUUM has no garbage tuples to kill (even if we only do lazy_cleanup_index() instead of lazy_vacuum_index()). This is the case that this patch proposes to have us skip touching indexes for. -- Peter Geoghegan -- Sent via pgsql-hackers maili

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-06 Thread Peter Geoghegan
most an alternative interface to the same functionality today. There can be another one in the future, if it serves a purpose, and the locking requirements are roughly the same for all the checks. I'd be fine with that. Let's just get the basic feature in for now, though. -- Peter Geogheg

[HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-06 Thread Peter Geoghegan
will end up being specialized to the problem he is trying to solve. I'm also slightly tempted to hard-code BufFiles as a new type of resource that a resource manager can take ownership of, but that also seems unappealing. What I've come up with may be as robust as anything will be for p

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-05 Thread Peter Geoghegan
On Sat, Mar 4, 2017 at 2:15 PM, Peter Geoghegan wrote: > So, I agree with Robert that we should actually use heap size for the > main, initial determination of # of workers to use, but we still need > to estimate the size of the final index [1], to let the cost model cap >

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-05 Thread Peter Geoghegan
ls are arcane such that it might as well be that simple most of the time. Even if you have time to listen to me explain it all, which you clearly don't, you're still probably not going to be able to apply what you've learned in a way that helps you. -- Peter Geoghegan -- Sent via

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-04 Thread Peter Geoghegan
inal index [1], to let the cost model cap the initial determination when maintenance_work_mem is just too low. (This cap will rarely be applied in practice, as I said.) [1] https://wiki.postgresql.org/wiki/Parallel_External_Sort#bt_estimated_nblocks.28.29_function_in_pageinspect -- Peter Geoghegan

Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-04 Thread Peter Geoghegan
o maximize the chances of that happening, but it's still generally quite possible (e.g. pg_stat_statements never swaps constants in a query like "SELECT 5, pg_stat_statements_reset()"). This means that we cannot really say that this buys us a machine-readable query text format, at least no

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-04 Thread Peter Geoghegan
index size, not table size. I can change it to be table size, based on what you said. But the workMem related cap, which probably won't end up being applied all that often in practice, *should* still do something with projected index size, since that really is what we're sorting, whic

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-04 Thread Peter Geoghegan
stimate. I don't really know what minimum amount of memory to insist workers have, which is why I provisionally chose one of those GUCs as the threshold. Any better ideas? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subsc

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-04 Thread Peter Geoghegan
dex size than current heap size. I agree with everything else you've said, I think. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 2017-03 Commitfest In Progress

2017-03-03 Thread Peter Geoghegan
> reviewing the aspects of it that touch on whether parallelism is being > done right, but I would like to have some help on the sorting end of > things. Your covering those aspects seems like something that would make this an easier sell to another reviewer. Thanks! -- Peter Geoghe

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-03 Thread Peter Geoghegan
class.relfrozenxid/pg_database.datfrozenxid are advanced past opaque->btpo.xact. While I can't see this explained anywhere, I'm pretty sure that that's supposed to be impossible, which this patch changes. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hacke

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-03 Thread Peter Geoghegan
On Fri, Mar 3, 2017 at 2:41 PM, Peter Geoghegan wrote: > In other words, the number of B-Tree pages that the last VACUUM > deleted, and thus made eligible to recycle by the next VACUUM has no > relationship with the number of pages the next VACUUM will itself end > up deleting, in gen

Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-03 Thread Peter Geoghegan
lock on the heap relation (i.e. vacuuming it) after the first CIC transaction ends, but before the second CIC transaction begins? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC

2017-03-03 Thread Peter Geoghegan
rrectness of CIC - a relatively infrequent operation - on the > assumption that no VM bits can be set concurrenty due to the SUE lock. I agree. FWIW, the extra time that CIC takes over a plain CI is much reduced these days. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-ha

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-03 Thread Peter Geoghegan
On Fri, Mar 3, 2017 at 11:49 AM, Peter Geoghegan wrote: >> Please verify my understanding of your thought process: We don't have >> to freeze indexes at all, ever, so if we see index bloat as a separate >> problem, we also see that there is no need to *link* index n

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-03 Thread Peter Geoghegan
On Fri, Mar 3, 2017 at 11:37 AM, Peter Geoghegan wrote: > Please verify my understanding of your thought process: We don't have > to freeze indexes at all, ever, so if we see index bloat as a separate > problem, we also see that there is no need to *link* index needs to > the

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-03 Thread Peter Geoghegan
a range of values for each index, a little like a BRIN index build. This range is what you go on to use to do a cheap index-scan-based B-Tree VACUUM. This could have far far less I/O, though has obvious risks that we need to worry about. That's work for another release, of course. -- Pete

Re: [HACKERS] Statement-level rollback

2017-03-03 Thread Peter Geoghegan
On Fri, Mar 3, 2017 at 9:01 AM, Andres Freund wrote: > On 2017-03-03 11:54:06 -0500, David Steele wrote: >> Given that this landed on March 28 with no discussion beforehand, I >> recommend that we immediately move this patch to the 2017-07 CF. > > Seconded. +1 -- Peter

Re: [HACKERS] Cleanup: avoid direct use of ip_posid/ip_blkid

2017-03-02 Thread Peter Geoghegan
at is invalid in some sense, even if it isn't actually set to InvalidOffsetNumber. So, seems pretty risky to me. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cost model for parallel CREATE INDEX

2017-03-02 Thread Peter Geoghegan
On Thu, Mar 2, 2017 at 5:50 AM, Robert Haas wrote: > On Wed, Mar 1, 2017 at 12:58 AM, Peter Geoghegan wrote: >> * This scales based on output size (projected index size), not input >> size (heap scan input). Apparently, that's what we always do right >> now. > >

[HACKERS] Cost model for parallel CREATE INDEX

2017-02-28 Thread Peter Geoghegan
l.org/wiki/Parallel_External_Sort#bt_estimated_nblocks.28.29_function_in_pageinspect [3] https://www.postgresql.org/message-id/CAMkU=1y_qp+QUPGk=JBJSTtcYQpW2k=v2lmytzko_8ftuuy...@mail.gmail.com [4] https://www.postgresql.org/message-id/cam3swzr6c+1cwghc40g9z5thfe3u2xbv55w5-tertfeooaz...@mail.gmail.com -- Peter Geo

Re: [HACKERS] Polyphase merge is obsolete

2017-02-27 Thread Peter Geoghegan
On Mon, Jan 16, 2017 at 5:56 PM, Peter Geoghegan wrote: > On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangas wrote: >> The number of *input* tapes we can use in each merge pass is still limited, >> by the memory needed for the tape buffers and the merge heap, but only one &g

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-02-27 Thread Peter Geoghegan
.0, but there might be non-linear increases in "the serious type of index bloat" as the proposed new setting was scaled up. I'd be much more worried about that. [1] https://archive.org/stream/symmetricconcurr00lani#page/6/mode/2up -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Should logtape.c blocks be of type long?

2017-02-26 Thread Peter Geoghegan
istent about a restriction like this, as Robert said. Given that fixing this issue will not affect the machine code generated by compilers for the majority of platforms we support, doing so seems entirely worthwhile to me. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-

Re: [HACKERS] Should logtape.c blocks be of type long?

2017-02-26 Thread Peter Geoghegan
s made more likely by the fact that we've made tuplesort faster in the past few releases (gains which the MAX_KILOBYTES restriction won't impinge on too much, particularly in Postgres 10). I find that unacceptable, at least for Postgres 10. -- Peter Geoghegan -- Sent via pgsql-hacke

[HACKERS] Should logtape.c blocks be of type long?

2017-02-25 Thread Peter Geoghegan
al one should assume that it is no wider than "int". This calls into question why any code that uses "long" didn't just use "int", at least in my mind. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-02-24 Thread Peter Geoghegan
putting *fully* dead B-Tree pages in the FSM for recycling. The interlock with RecentGlobalXmin is what makes it impossible for VACUUM to generally fully delete pages, *as well as* mark them as recyclable (put them in the FSM) all at once. Maybe you get this already, since, as I said, the terminolog

Re: [HACKERS] Poor memory context performance in large hash joins

2017-02-23 Thread Peter Geoghegan
method for memory contexts, it looks like you just reset the parent instead. > But I don't think that would work here. Are you aware of the fact that tuplesort.c got a second memory context for 9.6, entirely on performance grounds? -- Peter Geoghegan -- Sent via pgsql-hackers ma

Re: [HACKERS] Documentation improvements for partitioning

2017-02-23 Thread Peter Geoghegan
feature set, yet. Obviously that general principle is not under discussion. My point, of course, was that it seems pretty clear to me that this is on the right side of that fence. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Documentation improvements for partitioning

2017-02-23 Thread Peter Geoghegan
h I disagree with. There is nothing disappointing to me about this feature, and, as I said, I am unsurprised that it doesn't support certain things. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Documentation improvements for partitioning

2017-02-22 Thread Peter Geoghegan
t surprised at the limitations that this feature has, even if Bruce and Simon are. The documentation needs work, and perhaps the feature itself needs a small tweak here or there. Just not to a particularly notable degree, given the point we are in in the release cycle. -- Peter Geoghegan -- Sent

Re: [HACKERS] ICU integration

2017-02-20 Thread Peter Geoghegan
versions many different things, in fact. Importantly, it explicitly decouples behavioral issues (user visible sort order -- UCA version) from technical issues (collator implementation details). So, my original point is that that could change, and if that happens we ought to have a plan. But, it won'

Re: [HACKERS] ICU integration

2017-02-20 Thread Peter Geoghegan
xists. [1] http://site.icu-project.org/#TOC-Who-Uses-ICU- -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ICU integration

2017-02-20 Thread Peter Geoghegan
problem, and users will never learn to deal with issues like this well when it is by definition something that should never happen. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-02-20 Thread Peter Geoghegan
entGlobalXmin respected by VACUUM, that prevents this sort of recycling. I suspect that the restrictions on page deletion as opposed to page recycling is vastly more likely to cause pain to users, and that's not made any worse by this. -- Peter Geoghegan -- Sent via pgsql-hackers maili

Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

2017-02-17 Thread Peter Geoghegan
s that it will have checks for a large variety of invariants that involve the heap, and related SLRU structures such as MultiXacts. Though, that would probably necessitate code written by other people that are subject matter experts in areas that I am not. -- Peter Geoghegan -- Sent via pgsql-h

Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

2017-02-17 Thread Peter Geoghegan
would need to visit the heap, which tends to be much larger than any one index, or even all indexes. That would probably need to be random I/O, too. It might be possible to mostly not visit the heap, though -- I'm not sure offhand. I'd have to study the problem in detail, which I have no t

Re: [HACKERS] Partitioning vs ON CONFLICT

2017-02-16 Thread Peter Geoghegan
cing this directly. IIRC, that's what happens with inheritance-based partitioning. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Partitioning vs ON CONFLICT

2017-02-16 Thread Peter Geoghegan
ON CONFLICT DO NOTHING permits, then it should be possible for it to just work today -- infer_arbiter_indexes() will return immediately. This should be just like the old approach involving inheritance, in that that should be possible. No? -- Peter Geoghegan -- Sent via pgsql-hackers mailing l

Re: [HACKERS] Avoiding OOM in a hash join with many duplicate inner keys

2017-02-16 Thread Peter Geoghegan
ge, and therefore this seems highly prone to worry even when > worrying isn't really justified. +1. ndistinct has a general tendency to be wrong, owing to how ANALYZE works, which we see problems with from time to time. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-h

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-02-16 Thread Peter Geoghegan
e with random cases. > It's hard to speculate about this, but I guess that a significant > number of indexes in real world databases might be uncorrelated to > insert order. That would certainly be true with text, where we see a risk of (small) regressions. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-02-16 Thread Peter Geoghegan
On Thu, Feb 16, 2017 at 6:28 AM, Robert Haas wrote: > On Thu, Feb 9, 2017 at 7:10 PM, Peter Geoghegan wrote: >> At the risk of stating the obvious, ISTM that the right way to do >> this, at a high level, is to err on the side of unneeded extra >> unlink() calls, not leakin

<    1   2   3   4   5   6   7   8   9   10   >