Re: ON CONFLICT (and manual row locks) cause xmax of updated tuple to unnecessarily be set

2019-07-24 Thread Peter Geoghegan
> tuple locking - which makes sense, there was no cases where locks would > need to be carried forward. I agree that this is unfortunate. Are you planning on working on it? -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-24 Thread Peter Geoghegan
On Wed, Jul 24, 2019 at 3:06 PM Peter Geoghegan wrote: > There seems to be a kind of "synergy" between the nbtsplitloc.c > handling of pages that have lots of duplicates and posting list > compression. It seems as if the former mechanism "sets up the bowling > pins"

Re: ON CONFLICT (and manual row locks) cause xmax of updated tuple to unnecessarily be set

2019-07-25 Thread Peter Geoghegan
l, but that's the easy part. -- Peter Geoghegan

Re: PG 12 draft release notes

2019-07-25 Thread Peter Geoghegan
On Thu, Jul 25, 2019 at 6:37 PM Bruce Momjian wrote: > Attached patch applied, thanks. Thanks Bruce, -- Peter Geoghegan

Re: Patch for SortSupport implementation on inet/cdir

2019-07-25 Thread Peter Geoghegan
On Fri, Feb 8, 2019 at 11:13 PM Edmund Horner wrote: > I have some comments on the comments: Seems reasonable to me. Where are we on this? I'd like to get the patch committed soon. -- Peter Geoghegan

Re: Patch for SortSupport implementation on inet/cdir

2019-07-26 Thread Peter Geoghegan
rules about how the underlying types sort), or seemed to discuss things that were better discussed next to the relevant network_abbrev_convert() code. Thoughts? -- Peter Geoghegan v3-0001-Add-sort-support-for-inet-cidr-opfamily.patch Description: Binary data

Re: Patch for SortSupport implementation on inet/cdir

2019-07-26 Thread Peter Geoghegan
On Fri, Jul 26, 2019 at 6:58 PM Peter Geoghegan wrote: > I found this part of your approach confusing: > > > + /* > > +* Number of bits in subnet. e.g. An IPv4 that's /24 is 32 - 24 = 8. > > +* > > +* However, only some of the bits ma
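
The arithmetic in the quoted comment can be checked with a minimal sketch, written here in Python with the standard `ipaddress` module rather than in the patch's C. The helper name `host_bits` is hypothetical; it only subtracts the prefix length from the address width, and says nothing about how the patch masks or abbreviates keys.

```python
import ipaddress

def host_bits(cidr: str) -> int:
    """Bits left over after the network prefix (hypothetical helper)."""
    net = ipaddress.ip_network(cidr, strict=False)
    # max_prefixlen is 32 for IPv4 networks and 128 for IPv6 networks.
    return net.max_prefixlen - net.prefixlen

print(host_bits("192.168.1.0/24"))  # 32 - 24 = 8
print(host_bits("2001:db8::/32"))   # 128 - 32 = 96
```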

Re: Testing LISTEN/NOTIFY more effectively

2019-07-27 Thread Peter Geoghegan
s needed to acquire advisory locks at just the right points during execution. If I had to guess, I'd guess that it had something to do with that. I might be able to come up with a better explanation if I saw the diff. -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
quote marks on now: "Perhaps we could > implement A later." I don't claim to have any real answers here. I don't claim to understand how much of a problem this is. [1] https://15721.courses.cs.cmu.edu/spring2016/papers/a16-graefe.pdf [2] http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf -- See "6.7 Standard Practice" -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
ing called "instantaneous transaction rollback", which seems to make SQL Server optionally behave a lot more like Postgres [1], apparently with many of the same disadvantages as Postgres. I agree that there is probably a middle way that more or less has the advantages of both approaches. I don't really know what that should look like, though. [1] https://www.microsoft.com/en-us/research/uploads/prod/2019/06/p700-antonopoulos.pdf -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
't need to impose restrictions on TID stability. Which seems to be why we offer such a large variety of index access methods -- it's relatively straightforward for Postgres to add niche index AMs, such as SP-GiST. -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
On Mon, Jul 29, 2019 at 12:39 PM Peter Geoghegan wrote: > I think that indexes (or at least B-Tree indexes) will ideally almost > always have tuples that are the latest versions with zheap. The > exception is tuples whose ghost bit is set, whose visibility varies > based on the MVCC

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
ay, there seems to be pretty clearly a need for a > bit, but what does the bit mean? It could mean "please check the undo > log," in which case it'd have to be set on insert, eventually cleared, > and then reset on delete, but I think that's likely to suck. I think > therefore that the bit should mean > is-deleted-but-not-necessarily-all-visible-yet, which avoids that > problem. That sounds about right to me. -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
rs today). "Row forwarding" across heap pages is the traditional way of ensuring that TIDs in indexes are stable even in the worst case, apparently, but other approaches also seem possible. [1] http://www.vldb.org/pvldb/vol10/p781-Wu.pdf -- Peter Geoghegan

Re: pgbench - implement strict TPC-B benchmark

2019-07-30 Thread Peter Geoghegan
hy our traditional script > is not really TPC-B. That's treading on being false advertising. IANAL, but it may not even be permissible to claim that we have implemented "standard TPC-B". -- Peter Geoghegan

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-30 Thread Peter Geoghegan
arly have very small BufFileWrite() size arguments. tuplestore.c, for one. -- Peter Geoghegan

Re: Patch for SortSupport implementation on inet/cdir

2019-07-31 Thread Peter Geoghegan
On Fri, Jul 26, 2019 at 7:25 PM Peter Geoghegan wrote: > I guess that the idea here was to prevent masking on ipv6 addresses, > though not on ipv4 addresses. Obviously we're only dealing with a > prefix with ipv6 addresses, whereas we usually have the whole raw > ipaddr with ipv4

Re: pgbench - implement strict TPC-B benchmark

2019-07-31 Thread Peter Geoghegan
to me. Not sure where that leaves this patch. What problem is it actually trying to solve? [1] http://www.tpc.org/tpcb/ -- Peter Geoghegan

Re: Patch for SortSupport implementation on inet/cdir

2019-08-01 Thread Peter Geoghegan
gs easier to > reason about. Pushed. Thanks! -- Peter Geoghegan

The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-01 Thread Peter Geoghegan
ot use the reserved range? It seems preferable for everybody to consistently use the reserved OID range. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-01 Thread Peter Geoghegan
same mail at > CAH2-WzmCzNMebiN4-8p=ON92m0Rz0ybxNEKrO_2J+9DqWfWP=a...@mail.gmail.com :) Seems like I should propose a patch this time around. I don't do Perl, but I suppose I could manage something as trivial as this. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
Suggested random unused OID: 9099 I would like to push this patch shortly. How do people feel about this wording? (It's based on the documentation added by commit a6417078.) -- Peter Geoghegan v2-0001-unused_oids-suggestion.patch Description: Binary data
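
As a rough illustration of what producing such a suggestion involves (this is not the Perl in the attached patch), the sketch below picks a random OID that is not already in use. The function name, the range bounds passed in, and the in-memory set of used OIDs are all assumptions made for the example.

```python
import random

def suggest_unused_oid(used_oids, lo, hi, rng=random):
    """Return a random OID in [lo, hi] that is not in used_oids, or None."""
    candidates = [oid for oid in range(lo, hi + 1) if oid not in used_oids]
    if not candidates:
        return None  # the whole range is taken
    return rng.choice(candidates)

# Hypothetical set of OIDs already claimed by other patches.
used = {8000, 8001, 8002, 8500}
print("Suggested random unused OID:", suggest_unused_oid(used, 8000, 9999))
```

Picking at random, rather than always taking the lowest free value, is meant to make it less likely that two patches under development claim the same OID.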

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
hich implements your suggestion, generating output like the above. I haven't written a line of Perl in my life prior to today, so basic code review would be helpful. -- Peter Geoghegan v3-0001-unused_oids-suggestion.patch Description: Binary data

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
to be fairly unlucky to have that happen under the system introduced by commit a6417078.) It's probably the case that most patches that create a new pg_proc entry only create one. The question of consecutive OIDs only comes up with a fairly small number of patches. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
's worst Perl programmer is no excuse.) How about the attached? I've simply removed the "if ($oid > $prev_oid + 2)" test. -- Peter Geoghegan v4-0001-unused_oids-suggestion.patch Description: Binary data

Re: Optimize single tuple fetch from nbtree index

2019-08-02 Thread Peter Geoghegan
inner side by buffering outer side tuples (say based on the "k=:val" constant) seems like it might generalize well enough. I suggest Floris look into that possibility. This paper might be worth a read: https://dl.acm.org/citation.cfm?id=582278 (Though it also might not be worth a read -- I haven't actually read it myself.) -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
On Fri, Aug 2, 2019 at 3:52 PM Tom Lane wrote: > Better ... but I'm the world's second worst Perl programmer, > so I have little to say about whether it's idiomatic. Perhaps Michael can weigh in here? I'd rather hear a second opinion on v4 of the patch before proceeding. -- Peter Geoghegan

Re: Optimize single tuple fetch from nbtree index

2019-08-02 Thread Peter Geoghegan
On Fri, Aug 2, 2019 at 5:34 PM Peter Geoghegan wrote: > I wonder if some variety of block nested loop join would be helpful > here. I'm not aware of any specific design that would help with > Floris' case, but the idea of reducing the number of scans required on > the i

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-03 Thread Peter Geoghegan
m unused_oids would *maximize* the number of OID collisions. > We could > recommend the range if there are at least 10 OIDs available in the > range from the lowest position, and there are few patches eating more > than 5-10 OIDs at once. That sounds like an over-engineered soluti

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-05 Thread Peter Geoghegan
On Fri, Aug 2, 2019 at 1:28 PM Julien Rouhaud wrote: > I'm fine with it! Pushed a version with similar wording just now. Thanks! -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-05 Thread Peter Geoghegan
n the same way that I mentioned unique index support is broken). I also added a nearby FIXME comment to _bt_insertonpg_in_posting() -- I don't think that the code for splitting a posting list in two is currently crash-safe. How do you feel about officially calling this deduplication, not

Re: pg can create duplicated index without any errors even warnning

2019-08-05 Thread Peter Geoghegan
that has some disadvantages that you might want to avoid.) Questions like this are better suited to the pgsql-general list. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-05 Thread Peter Geoghegan
is will ever be a real problem. Just try again. > Wouldn't it be better to keep some room at the end of the allowed > array? Or at least avoid suggesting ranges where there is less than > 3-5 OIDs available consecutively. Not in my view. There is value in having simple, predic

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-05 Thread Peter Geoghegan
iguous OIDs (or whatever) far sooner than we'll run out of single OIDs. Now we have to worry about doing a second (actually a third) pass over the OIDs as a fallback when that happens. And so on. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-05 Thread Peter Geoghegan
ontiguous OIDs in the range 8000-. It's just busy work. -- Peter Geoghegan

Re: How am I supposed to fix this?

2019-08-06 Thread Peter Geoghegan
; If this takes too long, you can always adjust the query to only verify system indexes or TOAST indexes. -- Peter Geoghegan

Re: How am I supposed to fix this?

2019-08-06 Thread Peter Geoghegan
d explicit type casts. It's a contrib extension, so you have to "create extension amcheck" first. -- Peter Geoghegan

Re: How am I supposed to fix this?

2019-08-06 Thread Peter Geoghegan
always keen to hear about how much the tooling helps in the real world. -- Peter Geoghegan

Re: Locale support

2019-08-08 Thread Peter Geoghegan
it is built on top of GNU units, which is itself highly extensible. I'm not sure if this will be useful, since I am not an expert on calendar systems. -- Peter Geoghegan

Shrinking tuplesort.c's SortTuple struct (Was: More ideas for speeding up sorting)

2019-08-09 Thread Peter Geoghegan
rting. Besides, sorting itself is the bottleneck for tuplesort-using operations less and less these days -- the only remaining interesting bottleneck is probably in code like index_form_tuple(), which is probably a good target for JIT. In general, it's much harder to make tuplesort.c noticeably faster than it used to be -- we've picked all the low-hanging fruit. -- Peter Geoghegan

Re: Shrinking tuplesort.c's SortTuple struct (Was: More ideas for speeding up sorting)

2019-08-10 Thread Peter Geoghegan
entation that was added to Java 7 [2]. It uses all of the same tricks as our existing Bentley & McIlroy implementation, but is more cache efficient. It's considered the successor to B&M, and had input from Bentley himself. It is provably faster than B&M for a wide variety of inputs, at least on modern hardware. [1] http://www.vldb.org/journal/VLDBJ4/P603.pdf [2] https://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf -- Peter Geoghegan
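
For readers unfamiliar with the dual-pivot idea, here is a minimal sketch of the basic partitioning scheme in Python. It is only the textbook three-way partition around two pivots, under the assumption that the first and last elements serve as pivots; it is not tuplesort.c code, and it leaves out the tuning a production implementation would add (small-array cutoffs, careful pivot selection).

```python
def dual_pivot_quicksort(items, lo=0, hi=None):
    """Sort items in place using two pivots; returns the same list."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return items
    # Use the two ends as pivots, ordered so that p <= q.
    if items[lo] > items[hi]:
        items[lo], items[hi] = items[hi], items[lo]
    p, q = items[lo], items[hi]
    lt = lo + 1  # items[lo+1 .. lt-1] are < p
    gt = hi - 1  # items[gt+1 .. hi-1] are > q
    i = lo + 1
    while i <= gt:
        if items[i] < p:
            items[i], items[lt] = items[lt], items[i]
            lt += 1
            i += 1
        elif items[i] > q:
            # The element swapped in from gt is unexamined, so i stays put.
            items[i], items[gt] = items[gt], items[i]
            gt -= 1
        else:
            i += 1
    # Move both pivots into their final positions.
    lt -= 1
    gt += 1
    items[lo], items[lt] = items[lt], items[lo]
    items[hi], items[gt] = items[gt], items[hi]
    # Recurse on the three partitions: < p, between p and q, > q.
    dual_pivot_quicksort(items, lo, lt - 1)
    dual_pivot_quicksort(items, lt + 1, gt - 1)
    dual_pivot_quicksort(items, gt + 1, hi)
    return items

print(dual_pivot_quicksort([9, 4, 7, 1, 8, 2, 2, 6, 0, 3]))
```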

Re: Do not check unlogged indexes on standby

2019-08-12 Thread Peter Geoghegan
port of corruption when that happens, rather than letting an ambiguous "can't happen" error get raised by low-level code. This might be possible with system catalog corruption, for example. Finally, I thought that the WARNING was a bit strong -- a NOTICE is more appropriate. Thanks! -- Peter Geoghegan

Re: Do not check unlogged indexes on standby

2019-08-12 Thread Peter Geoghegan
uppose that bt_right_page_check_scankey() helps with transposed pages, but doesn't help so much when you have WAL-level inconsistencies. -- Peter Geoghegan

Re: Do not check unlogged indexes on standby

2019-08-13 Thread Peter Geoghegan
evel. We should not try to be too clever about ignorable/half-dead/deleted pages, to be conservative.) -- Peter Geoghegan

Re: Use PageIndexTupleOverwrite() within nbtsort.c

2019-08-13 Thread Peter Geoghegan
physically different page (even after masking within btree_mask()). However, I eventually decided that you had it right. Your _bt_mark_page_halfdead() change is clearer overall and doesn't break WAL consistency checking in practice, for reasons that are no less obvious than before. Thanks! -- Peter Geoghegan

Re: Improve search for missing parent downlinks in amcheck

2019-08-13 Thread Peter Geoghegan
B-Tree pages may be allocated, but those are always auxiliary (e.g., * they are current target's child pages). Conceptually, problems are only * ever found in the current target page (or for a particular heap tuple during * heapallindexed verification). Each page found by verification's left/right, * top/bottom scan becomes the target exactly once. */ -- Peter Geoghegan

Re: Removing unneeded downlink field from nbtree stack struct

2019-08-14 Thread Peter Geoghegan
cked). This seemed like something that was really up to the callers. Pushed a version with that change. Thanks for the review! -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-19 Thread Peter Geoghegan
ure what you mean by this. I suppose that it doesn't matter, since we both prefer the alternative that you came up with anyway. > > How do you feel about officially calling this deduplication, not > > compression? I think that it's a more accurate name for the technique. > I agree. > Should I rename all related names of functions and variables in the patch? Please rename them when convenient. -- Peter Geoghegan
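
The naming point can be illustrated with a toy model: each distinct key is stored once, followed by a "posting list" of the heap TIDs that share it, and nothing is encoded or compressed along the way. The sketch below is a Python illustration only; the tuple layout and the `deduplicate` name are assumptions and do not reflect the patch's on-disk format.

```python
from itertools import groupby

def deduplicate(index_tuples):
    """Collapse sorted (key, tid) pairs into (key, [tid, ...]) posting lists."""
    return [
        (key, [tid for _, tid in group])
        for key, group in groupby(index_tuples, key=lambda t: t[0])
    ]

# Hypothetical leaf page contents: (key, (block, offset)), already in key order.
leaf = [("apple", (1, 1)), ("apple", (1, 7)), ("apple", (2, 3)), ("pear", (1, 2))]
print(deduplicate(leaf))
```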

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-19 Thread Peter Geoghegan
> all regression tests, so I think it's ready for review again. I'm looking at it now. I'm going to spend a significant amount of time on this tomorrow. I think that we should start to think about efficient WAL-logging now. > In the meantime, I'll run more stress-tests. As you probably realize, wal_consistency_checking is a good thing to use with your tests here. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-22 Thread Peter Geoghegan
n order, but also have many duplicates. Removing the BT_COMPRESS_THRESHOLD stuff really helped with those indexes. Want me to send this data and the associated tests script over to you? -- Peter Geoghegan

Re: when the IndexScan reset to the next ScanKey for in operator

2019-08-22 Thread Peter Geoghegan
m/cah2-wzmrt_0ybhf05axqb2oituqiqakr0lznntj8x3kadkz...@mail.gmail.com -- Peter Geoghegan

Re: Optimize single tuple fetch from nbtree index

2019-08-23 Thread Peter Geoghegan
t -- at a minimum, this requires more documentation. This code is a few years old, but I still wouldn't be surprised if it turned out to be slightly wrong in a way that was important. We still have no way of detecting if a buffer is accessed without a pin. There have been numerous bugs like that before. (We have talked about teaching Valgrind to detect the case, but that never actually happened.) -- Peter Geoghegan

Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)

2019-08-24 Thread Peter Geoghegan
hmee19@news-spur.riddles.org.uk That was a BufFile that was under the control of a tuplestore, so it was similar to but different from your case. I suspect it's related. -- Peter Geoghegan

Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
ific exceptions. In any case, I'm certain that problems like the btree/numeric display scale problem are simply not worth solving directly. That would add a huge amount of complexity for very little benefit. [1] https://commitfest.postgresql.org/24/2202/ -- Peter Geoghegan

"Classic" nbtree suffix truncation prototype

2019-08-25 Thread Peter Geoghegan
r of internal pages (e.g. a 5x+ reduction), along with a very small reduction in the number of leaf pages. Users that happen to have a lot of indexes that look like this are likely to find classic suffix truncation compelling, but that doesn't seem like a good enough reason to push ahead with the patch. -- Peter Geoghegan

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
On Sun, Aug 25, 2019 at 2:18 PM Peter Geoghegan wrote: > > Indeed, we run up against this sort of thing all the time in, eg, planner > > optimizations. I think some sort of "equality is precise" indicator > > would be really useful for a lot of things. > > Th

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
completely, because we're not directly concerned with the physical representation used within an index. In fact, a major goal for this new infrastructure is that nbtree gets to fully own the representation (it just needs to know about the high level or logical requirements). -- Peter Geoghegan

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
On Sun, Aug 25, 2019 at 2:55 PM Peter Geoghegan wrote: > I suppose that we'd add something new to CREATE OPERATOR CLASS to make > this work? My instinct is to avoid adding things that are only > meaningful for a single AM to interfaces like CREATE OPERATOR CLASS, > but the s

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
h the "C" collation isn't otherwise usable. Perhaps there are far more compelling planner optimizations that I haven't considered, though. This idea probably has problems with interesting sort orders that aren't actually that interesting. -- Peter Geoghegan

IoT/sensor data and B-Tree page splits

2019-08-26 Thread Peter Geoghegan
s? What would a robust algorithm look like? Perhaps this is a problem that isn't worth solving right now, but it is definitely a real problem. [1] https://www.postgresql.org/message-id/66ce997fb523c04e9749452273184c6c137cb88...@exch-mbx-113.vmware.com -- Peter Geoghegan

Re: IoT/sensor data and B-Tree page splits

2019-08-26 Thread Peter Geoghegan
indexes have clear disadvantages, including the fact that you have to know that your data is amenable to BRIN indexing in order to use a BRIN index. -- Peter Geoghegan

Re: IoT/sensor data and B-Tree page splits

2019-08-26 Thread Peter Geoghegan
n Postgres 13, which would improve matters further with low cardinality indexes.) -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-27 Thread Peter Geoghegan
anything that will be broken by that lie, because it doesn't care about the actual content of posting lists. And, we can fix the "fake new item is not actually real new item" issue at one point within _bt_split(), just as we're about to WAL log. What do you think of that approach? -- Peter Geoghegan

Re: Re: Email to hackers for test coverage

2019-08-28 Thread Peter Geoghegan
d is correct -- the NULL handling within ApplySortAbbrevFullComparator() cannot actually be used currently. I wouldn't change anything about the code, though, since it's useful to defensively handle NULLs. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-29 Thread Peter Geoghegan
at we see here is both logical and surprising. How do you feel about this CREATE INDEX index-size-is-larger business? -- Peter Geoghegan

Re: Yet another fast GiST build

2019-08-29 Thread Peter Geoghegan
uld have a lot of advantages in the long term. It is certainly theoretically appealing. Could this make it easier to use merge join with containment operators? I'm thinking of things like geospatial joins, which can generally only be performed as nested loop joins at the moment. This is often wildly inefficient. -- Peter Geoghegan

Re: Yet another fast GiST build

2019-08-29 Thread Peter Geoghegan
lues. We've prototyped that, see [1]. I'm pretty sure that spatial joins generally need two spatial indexes (usually R-Trees). There seems to have been quite a lot of research in it in the 1990s. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-29 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 5:07 PM Peter Geoghegan wrote: > I agree that v9 might be ever so slightly more space efficient than v5 > was, on balance. I see some Valgrind errors on v9, all of which look like the following two sample errors I go into below. First one: ==11193== VALGRINDERROR

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-31 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 10:10 PM Peter Geoghegan wrote: > I see some Valgrind errors on v9, all of which look like the following > two sample errors I go into below. I've found a fix for these Valgrind issues. It's a matter of making sure that _bt_truncate() sizes new pivot

Re: Contributing with code

2017-12-31 Thread Peter Geoghegan
portant parts of successfully contributing to PostgreSQL. It's usually essential to understand in detail why the thing that you're thinking of working on doesn't already exist. The TODO list seems to suggest almost the opposite, and as such is a trap for inexperienced hackers. -- Peter Geoghegan

Re: TODO list (was Re: Contributing with code)

2017-12-31 Thread Peter Geoghegan
ng it to a high standard then it probably would have happened already. The fact that it hasn't happened tells us plenty. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-02 Thread Peter Geoghegan
an a week away from posting a patch that I'm going to mark "ready for committer". I've already made the change above, and once I spend time on trying to break the few small changes needed within buffile.c I'll have taken it as far as I can, most likely. -- Peter Geoghegan

Re: [HACKERS] GUC for cleanup indexes threshold.

2018-01-06 Thread Peter Geoghegan
fault Sawada-san for waiting to hear other people's views before proceeding. It really needs to be properly discussed. -- Peter Geoghegan

Re: [HACKERS] GUC for cleanup indexes threshold.

2018-01-06 Thread Peter Geoghegan
nvenient way of using L&S's technique to defer recycling until it is definitely safe. We only need to make sure that _bt_page_recyclable() cannot become confused by XID wraparound to fix this problem -- that's it. -- Peter Geoghegan

Re: Unimpressed with pg_attribute_always_inline

2018-01-08 Thread Peter Geoghegan
all the variables are. In my experience, inlining hurts > both of those things, which is why I'm saying that forcing inlining > even in non-optimized builds is a bad idea. Isn't that an argument against inlining in general, rather than forcing inlining in particular? -- Peter Geoghegan

Re: Unimpressed with pg_attribute_always_inline

2018-01-08 Thread Peter Geoghegan
oes, even though the goal of the usage is > just to override the compiler's inlining heuristics. Sorry, I found the way this was discussed confusing. Anyway, ISTM that it should be possible to make pg_attribute_always_inline have no effect in typical debug builds. Wouldn't that make everyone happy? -- Peter Geoghegan

Re: Unimpressed with pg_attribute_always_inline

2018-01-08 Thread Peter Geoghegan
bout inlining in some specific cases, and may therefore want to make inlining absolutely mandatory. IIUC, that's almost what we want, except that it also inlines with -O0, which we do not want. Have I missed the point here? -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-09 Thread Peter Geoghegan
hy external sorts can be faster than internal sorts -- this happens fairly frequently these days, especially with CREATE INDEX, where being able to write out the index as it merges on-the-fly helps a lot.) -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-10 Thread Peter Geoghegan
d the same number of input > + * tapes as workers. > > I can't interpret the word "leader-wise". A partition-wise join is a > join done one partition at a time, but a leader-wise logical tape set > is not done one leader at a time. If there's another meaning to the > affix -wise, I'm not familiar with it. Don't we just mean "a single > logical tapeset managed by the leader"? Yes, we do. Will change. > There's a lot here I haven't grokked yet, but I'm running out of > mental energy so I think I'll send this for now and work on this some > more when time permits, hopefully tomorrow. The good news is that the things that you took issue with were about what I expected you to take issue with. You seem to be getting through the review of this patch very efficiently. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-10 Thread Peter Geoghegan
On Wed, Jan 10, 2018 at 1:31 PM, Robert Haas wrote: >> Can we actually call it max_parallel_maintenance_workers instead? >> I mean we don't have work_mem_maintenance. > > Good point. WFM. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-10 Thread Peter Geoghegan
ase too, but much less so. That's really not why he did it. > I assume the author credit will be "Peter > Geoghegan, Rushabh Lathia" in that order, but let me know if anyone > thinks that isn't the right idea. "Peter Geoghegan, Rushabh Lathia" seems right.

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-10 Thread Peter Geoghegan
the idea of having a leader-wise space following concatenating worker tapes (who have original/worker-wise space). We must apply an offset to get from a worker-wise offset to a leader-wise offset. This made more sense in an earlier version. I overlooked this during recent self review. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-11 Thread Peter Geoghegan
nt about being able to test low memory conditions from the first commit is that insisting on it is reasonable. I don't actually feel strongly either way, though, and am not doing any insisting myself. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-11 Thread Peter Geoghegan
On Thu, Jan 11, 2018 at 12:06 PM, Peter Geoghegan wrote: > It might make sense to have the "minimum memory per participant" value > come from a GUC, rather than be hard coded (it's currently hard-coded > to 32MB). > What do you think of that idea? A third opt
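
One way a per-participant memory floor like this could be applied is to drop workers until every participant's share of the memory budget reaches the minimum. The Python sketch below shows that idea under stated assumptions: the function name, the kB units, and the treatment of the leader as an extra participant are illustrative and are not taken from the patch.

```python
MIN_KB_PER_PARTICIPANT = 32 * 1024  # the 32MB floor mentioned above, in kB

def cap_workers(requested_workers, maintenance_work_mem_kb,
                leader_participates=True):
    """Reduce the worker count until each participant gets at least 32MB."""
    extra = 1 if leader_participates else 0
    workers = requested_workers
    while (workers > 0 and
           maintenance_work_mem_kb // (workers + extra) < MIN_KB_PER_PARTICIPANT):
        workers -= 1
    return workers

print(cap_workers(8, 64 * 1024))    # 64MB budget: 1 worker plus the leader
print(cap_workers(2, 1024 * 1024))  # 1GB budget: both workers are kept
```

Making the floor a setting instead of a constant would amount to replacing `MIN_KB_PER_PARTICIPANT` with a parameter.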

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-11 Thread Peter Geoghegan
ome extent that's the expectation that has been established already. I am not far from posting a revision that incorporates all of your feedback. Expect that tomorrow afternoon your time at the latest. Of course, you may have more feedback for me in the meantime. Let me know if I should hold off on posting a new version. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-11 Thread Peter Geoghegan
ture change you suggested. > I'm not confident I completely understand what's going on with the > logtape stuff yet, so I might have more comments (or better ones) > after I study this further. To your question about whether to go > ahead and post a new version, I'm OK to keep reviewing this version > for a little longer or to switch to a new one, as you prefer. I have > not made any local changes, just written a blizzard of email text. > :-p Great. Thanks. I've caught up with you again. I just need to take a look at what I came up with with fresh eyes, and maybe do some more testing. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-12 Thread Peter Geoghegan
GetOldestXmin()'s second argument is true in the patch, rather than PROCARRAY_FLAGS_VACUUM. That's due to bitrot that was not caught during some previous rebase (commit af4b1a08 changed the signature). Will fix. You've given me a lot more to work through in your most recent mail, Rober

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-13 Thread Peter Geoghegan
ine with killing force_parallel_mode, though, because it will be possible to force the use of parallelism by using the existing parallel_workers table storage param in the next version of the patch, regardless of how small the table is. Thanks for the review. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-15 Thread Peter Geoghegan
On Sun, Jan 14, 2018 at 8:25 PM, Amit Kapila wrote: > On Sun, Jan 14, 2018 at 1:43 AM, Peter Geoghegan wrote: >> On Sat, Jan 13, 2018 at 4:32 AM, Amit Kapila wrote: >>> Yeah, but this would mean that now with parallel create index, it is >>> possible that some tuples

Re: let's not complain about harmless patch-apply failures

2018-01-16 Thread Peter Geoghegan
g problems mechanically. Rebasing a patch without conflicts (including seeing a warning about offsets) does not mean that your patch didn't become broken in some subtle, harmful way. Mechanical detection is only useful to the extent that it guides and augments human oversight. -- Peter Geoghegan

Re: let's not complain about harmless patch-apply failures

2018-01-16 Thread Peter Geoghegan
erally not helpful unless used thoughtfully. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-17 Thread Peter Geoghegan
ogress, to make sure that what I add is comparable to what ultimately gets committed for parallel query. -- Peter Geoghegan

Re: PostgreSQL crashes with SIGSEGV

2018-01-17 Thread Peter Geoghegan
akes, and mostly posted it because I thought that it was a useful counterpoint. -- Peter Geoghegan

Re: PostgreSQL crashes with SIGSEGV

2018-01-17 Thread Peter Geoghegan
tuplesort_getdatum() subset fix might still be a good idea. I wonder where you stand on this. -- Peter Geoghegan

Re: PostgreSQL crashes with SIGSEGV

2018-01-17 Thread Peter Geoghegan
hat grouping sets is right (and that mode_final() is wrong). Do you? -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-17 Thread Peter Geoghegan
On Wed, Jan 17, 2018 at 10:27 AM, Robert Haas wrote: > On Wed, Jan 17, 2018 at 12:27 PM, Peter Geoghegan wrote: >> I think that both problems (the live _bt_parallel_scan_and_sort() bug, >> as well as the general issue with needing to account for parallel >> worker fork(

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-17 Thread Peter Geoghegan
etic data distributions/types. >> WFM. Also added documentation for the wait events to monitoring.sgml, >> which I somehow missed the first time around. > > But you forgot to update the preceding "morerows" line, so the > formatting will be all messed up. Fixed. >> I removed "really". The point of the comment is that we've already set >> up temp tablespaces for the shared fileset in the parallel case. >> Shared filesets figure out which tablespaces will be used up-front -- >> see SharedFileSetInit(). > > So why not say it that way? i.e. For parallel sorts, this should have > been done already, but it doesn't matter if it gets done twice. Okay. > I don't see any reason not to make those contingent only on > trace_sort. The user can puzzle apart which messages are which from > the PIDs in the logfile. Okay. I have removed anything that restrains the verbosity of trace_sort for the WORKER() case. I think that you were right about it the first time, but I now think that this is going too far. I'm letting it go, though. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-17 Thread Peter Geoghegan
oesn't really > matter: if that comes along later, it will be trivial to adjust the > code to take advantage of it. Okay. I'll work on adopting dynamic barriers in the way you described. I just wanted to make sure that we're all on the same page about what that looks like. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-18 Thread Peter Geoghegan
parallel_leader_participation=off allow a degenerate parallel CREATE INDEX in the next version. I think that it will make parallel_leader_participation less useful, with no upside, but there doesn't seem to be much more that I can do about that. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-18 Thread Peter Geoghegan
out the question of ripping out parallel_leader_participation entirely. > Can you please elaborate what part of optimizer are you talking about > where without leader participation partial path will always lose to a > serial sequential scan path? See my remarks to Robert just now. -- Peter Geoghegan
