Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-08-05 Thread Peter Geoghegan
o a worker at once within the parallel heap scan that feeds workers. The leader, which merges worker runs, may ultimately have to perform fewer comparisons as a result of this, which is where most of the benefit would be. -- Peter Geoghegan

[HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-06 Thread Peter Geoghegan
On Sun, Aug 6, 2017 at 1:06 PM, Peter Geoghegan wrote: > On Sat, Aug 5, 2017 at 8:26 PM, Tom Lane wrote: >> I'm quite disturbed though that the set of installed collations on these >> two test cases seem to be entirely different both from each other and from >> wh

Re: [HACKERS] max_files_per_processes vs others uses of file descriptors

2017-08-07 Thread Peter Geoghegan
this specifically about postgres_fdw, or is there some other specific problem you have in mind, that this would help solve? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-07 Thread Peter Geoghegan
On Mon, Aug 7, 2017 at 2:50 PM, Peter Eisentraut wrote: > On 8/6/17 20:07, Peter Geoghegan wrote: >> I've looked into this. I'll give an example of what keyword variants >> there are for Greek, and then discuss what I think each is. > > I'm not sure why we

Re: [HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-07 Thread Peter Geoghegan
COLLATION If it's mandatory for get_icu_language_tag() to not throw an error during initdb import when passed strings like these (that are generated mechanically), why should we not do the same with CREATE COLLATION? While the choice to preserve BCP 47's tolerance of missing collations is

Re: [HACKERS] Possible issue with expanded object infrastructure on Postgres 9.6.1

2017-08-08 Thread Peter Geoghegan
On Thu, Jan 19, 2017 at 5:45 PM, Peter Geoghegan wrote: > A customer is on 9.6.1, and complains of a segfault observed at least > 3 times. > I can use GDB to get details of the instruction pointer that appeared > in the kernel trap error, which shows a function from the expa

Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Peter Geoghegan
unlike Glibc. This will give more useful results when sorting Japanese. The best explanation of the difference that I can understand is here, under "Why do CJK strings sort incorrectly in Unicode?": https://dev.mysql.com/doc/refman/5.5/en/faqs-cjk.html -- Peter Geoghegan

[HACKERS] What users can do with custom ICU collations in Postgres 10

2017-08-09 Thread Peter Geoghegan
s can also have numbers sort like numbers should when compared against other numbers, by using the numericOrdering option (not shown). numericOrdering would be great for things like alphanumeric invoice numbers, or the alphanumeric car registration plate numbers that are used in certain count
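To illustrate what "numbers sort like numbers" means for alphanumeric identifiers, here is a minimal Python sketch. It does not call ICU; natural_key is a hypothetical helper that only mimics the visible effect described for numericOrdering, and the invoice strings are made up for the example.

```python
import re

def natural_key(s: str):
    """Build a sort key in which runs of digits compare as integers.

    Hypothetical helper for illustration only. ICU implements
    numericOrdering inside the collator, not by splitting strings.
    """
    parts = re.split(r"(\d+)", s)
    # With a capturing group, re.split places digit runs at odd indexes.
    return [(1, int(p)) if i % 2 else (0, p) for i, p in enumerate(parts)]

invoices = ["INV-10", "INV-9", "INV-100", "INV-2"]

# Plain string comparison looks at one character at a time.
plain = sorted(invoices)

# Digit runs are compared by numeric value instead.
natural = sorted(invoices, key=natural_key)
```

With plain comparison "INV-10" sorts ahead of "INV-2"; with the numeric key, "INV-2" comes first, which is the ordering most people expect for invoice or registration numbers.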

Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Peter Geoghegan
riant options (e.g., traditional Spanish sort order, alternative Japanese sort order, pictographic emoji sorting), with some further generic options for varying how case is handled, how numbers are handled, and other things like that. -- Peter Geoghegan

Re: [HACKERS] Getting server crash on Windows when using ICU collation

2017-08-10 Thread Peter Geoghegan
g configure test, replacing it with something that will work just the same on Windows, presuming Peter also reverts the commit that had ICU never use ucol_strcollUTF8() on Windows. -- Peter Geoghegan

Re: [HACKERS] Getting server crash on Windows when using ICU collation

2017-08-10 Thread Peter Geoghegan
On Thu, Aug 10, 2017 at 3:57 PM, Peter Geoghegan wrote: > On Thu, Aug 10, 2017 at 11:02 AM, Robert Haas wrote: >> On Fri, Jun 23, 2017 at 1:14 AM, Ashutosh Sharma >> wrote: >>> Okay, attached is the patch which first detects the platform type and >>> runs the

Re: [HACKERS] Thoughts on unit testing?

2017-08-13 Thread Peter Geoghegan
quires buy-in from patch authors to work. To put it another way, a patch author whose patch touches storage implicitly asserts that their patch is correct. It would be useful if they provided a precise falsifiable statement about its correctness up front, in the form of verification code. -- Peter

Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-13 Thread Peter Geoghegan
e were wrong to disable the use of strxfrm() for abbreviated keys? I think that it's useful for these things to be handled in an adversarial manner, in the same way that litigation is adversarial in a common law court. I doubt that Noah actually set out to demoralize anyone. He is just doing the job

Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-13 Thread Peter Geoghegan
On Sun, Aug 13, 2017 at 2:22 PM, Andres Freund wrote: > On 2017-08-13 16:55:33 -0400, Tom Lane wrote: >> Peter Geoghegan writes: >> > I think that it's useful for these things to be handled in an >> > adversarial manner, in the same way that litigation is advers

Re: [HACKERS] What users can do with custom ICU collations in Postgres 10

2017-08-14 Thread Peter Geoghegan
an get the "co" variants there. Should be for the most part obvious which one is interesting to which locale, since there are not that many "co" variants to choose from, and users will probably know what to look for if they look at all. -- Peter Geoghegan

Re: [HACKERS] INSERT .. ON CONFLICT DO SELECT [FOR ..]

2017-08-14 Thread Peter Geoghegan
O SELECT to raise a cardinality violation error? Why or why not? -- Peter Geoghegan

Re: [HACKERS] What users can do with custom ICU collations in Postgres 10

2017-08-15 Thread Peter Geoghegan
options to choose from, and I think that for the most part it's reasonably obvious which one is desirable. For example, Chinese people are probably well aware of what Pinyin is, and what stroke is. Things like EOR and search are much more esoteric, but also much less useful. So, I wouldn

Re: [HACKERS] What users can do with custom ICU collations in Postgres 10

2017-08-15 Thread Peter Geoghegan
On Tue, Aug 15, 2017 at 11:33 AM, Peter Eisentraut wrote: > On 8/9/17 18:49, Peter Geoghegan wrote: >> I'd like to give a demo on what is already possible, but not currently >> documented. I didn't see anyone else comment on this, including Peter >> E (maybe I misse

Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Peter Geoghegan
>> >> Not sure if this is your bug or if it's exposing a pre-existing >> deficiency in the atomics code, viz, failure to ensure that >> pg_atomic_uint64 is actually a 64-bit-aligned type. Andres? > > I suspect it's the former. Suspect that the shared memory that holds >

Re: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-19 Thread Peter Geoghegan
verable, and to explain the capabilities of custom ICU collations more generally. [1] https://postgr.es/m/f67f36d7-ceb6-cfbd-28d4-413c6d22f...@2ndquadrant.com [2] https://postgr.es/m/3862d484-f0a5-9eef-c54e-3f6808338...@2ndquadrant.com -- Peter Geoghegan

Re: [HACKERS] [PATCH] Incremental sort

2017-04-26 Thread Peter Geoghegan
the presorted input. I think that it isn't fair to credit our qsort with doing so well on a 100% presorted case, because it doesn't do the necessary bookkeeping to not throw that work away completely in certain important cases. -- Peter Geoghegan VMware vCenter Server https://www.vmw

Re: [HACKERS] [PATCH] Incremental sort

2017-04-26 Thread Peter Geoghegan
o as to not give too much credit to the "high risk" presort check optimization. The switch to insertion sort that we left in (not the bad one removed by a3f0b3d -- the insertion sort that actually comes from the B&M paper) does "legitimately" make sorting faster with pr

[HACKERS] A design for amcheck heapam verification

2017-04-28 Thread Peter Geoghegan
look like. The non-deterministic false negatives may need to be considered by the user visible interface, which is the main reason I mention it. [1] postgr.es/m/20161017014605.ga1220...@tornado.leadboat.com -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] Logical replication in the same cluster

2017-05-01 Thread Peter Geoghegan
. Is someone going to get around to fixing the problem for CREATE INDEX CONCURRENTLY (e.g., having extra steps to drop the useless index during recovery)? IIRC, this was always the plan. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] Logical replication in the same cluster

2017-05-01 Thread Peter Geoghegan
a duplicate violation? I imagine that that's the much more common case. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] Logical replication in the same cluster

2017-05-01 Thread Peter Geoghegan
od > idea, because it'll push down the likelihood of the issue below where > people will see it, but it'll still be likely enough for it to create > problems. I was concerned about that too. I have a hard time defending changes like this to myself, but it doesn't hurt to

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
to receive an even share of memory). As I said, even if I was totally willing to duplicate the effort that went into respecting work_mem as a budget within places like tuplesort.c, having as little infrastructure code as possible is a specific goal for amcheck. [1] https://www.eecs.harvard.edu/

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Fri, Apr 28, 2017 at 6:02 PM, Peter Geoghegan wrote: > - Is committed, and committed before RecentGlobalXmin. Actually, I guess amcheck would need to use its own scan's snapshot xmin instead. This is true because it cares about visibility in a way that's "backwards" re

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Mon, May 1, 2017 at 2:10 PM, Peter Geoghegan wrote: > Actually, I guess amcheck would need to use its own scan's snapshot > xmin instead. This is true because it cares about visibility in a way > that's "backwards" relative to existing code that tests something &

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Mon, May 1, 2017 at 4:28 PM, Peter Geoghegan wrote: > Anyone have an opinion on any of this? Offhand, I think that calling > GetOldestXmin() once per index when its "amcheck whole index scan" > finishes would be safe, and yet provide appreciably better test > covera

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
with the MVCC snapshot's xmin in the first place -- I really don't have an opinion either way just yet. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Peter Geoghegan
etely wrong when you said that. Lesson learned, I suppose. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-05 Thread Peter Geoghegan
oo much effort into modelling concurrency ahead of optimizing serial performance. The machine's *aggregate* memory bandwidth should be used as efficiently as possible, and parallelism is just one (very important) tool for making that happen. -- Peter Geoghegan VMware vCenter Server https://

Re: [HACKERS] snapbuild woes

2017-05-11 Thread Peter Geoghegan
On Thu, May 11, 2017 at 2:51 PM, Andres Freund wrote: > Now that that's done, here's an updated version of that patch. Note the > new logic to trigger xl_running_xact's to be logged at the right spot. > Works well in my testing. You forgot the patch. :-) -- Peter G

Re: [HACKERS] A design for amcheck heapam verification

2017-05-11 Thread Peter Geoghegan
On Mon, May 1, 2017 at 6:39 PM, Peter Geoghegan wrote: > On Mon, May 1, 2017 at 6:20 PM, Tom Lane wrote: >> Maybe you can fix this by assuming that your own session's advertised xmin >> is a safe upper bound on everybody else's RecentGlobalXmin. But I'm not >

Re: [HACKERS] Hash Functions

2017-05-14 Thread Peter Geoghegan
whatever non-technical reasons remain are actually technical debt in disguise. Where this leaves hash partitioning, I cannot say. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] Hash Functions

2017-05-14 Thread Peter Geoghegan
is provided for by the application's client encoding. That's a great ideal to have, and one that is very close to completely workable. -- Peter Geoghegan VMware vCenter Server https://www.vmware.com/

Re: [HACKERS] strcmp() tie-breaker for identical ICU-collated strings

2017-06-01 Thread Peter Geoghegan
If we didn't do a binary comparison as a > tie-breaker, wouldn't the result be logically incompatible with the = > operator, which does a binary comparison? I agree with that assessment. -- Peter Geoghegan
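The point about the tie-breaker can be sketched in a few lines of Python. This is an illustration under stated assumptions, not PostgreSQL or ICU code: collation_key is a hypothetical stand-in for a collator that treats some byte-wise different strings as ties (here, strings differing only in case), and compare_with_tiebreak is a made-up name.

```python
from functools import cmp_to_key

def collation_key(s: str) -> str:
    # Hypothetical stand-in for a real collator: strings that differ
    # only in case are treated as equal by the "collation".
    return s.casefold()

def compare_with_tiebreak(a: str, b: str) -> int:
    """Compare by collation first, then by raw bytes if the collation ties."""
    ka, kb = collation_key(a), collation_key(b)
    if ka != kb:
        return -1 if ka < kb else 1
    # The collation says "equal". Fall back to a binary comparison so the
    # result is 0 only when the strings are byte-for-byte identical.
    ba, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ba > bb) - (ba < bb)

ordered = sorted(["b", "A", "a", "B"], key=cmp_to_key(compare_with_tiebreak))
```

Because the comparator returns 0 only for byte-identical strings, it agrees with an equality operator that compares bytes, which is the consistency property the quoted question is about.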

Re: [HACKERS] strcmp() tie-breaker for identical ICU-collated strings

2017-06-02 Thread Peter Geoghegan
gular collations wouldn't need to change their behavior/implementation (to use ucol_equal() within texteq(), and so on). [1] http://unicode.org/reports/tr10/#Forcing_Deterministic_Comparisons -- Peter Geoghegan

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-06 Thread Peter Geoghegan
Table pass them to the trigger code explicitly. I suppose you'll need two tuplestores for the ON CONFLICT DO UPDATE case -- one for updated tuples, and the other for inserted tuples. -- Peter Geoghegan

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-06 Thread Peter Geoghegan
On Tue, Jun 6, 2017 at 3:47 PM, Peter Geoghegan wrote: > I suppose you'll need two tuplestores for the ON CONFLICT DO UPDATE > case -- one for updated tuples, and the other for inserted tuples. Also, ISTM that the code within ENRMetadataGetTupDesc() probably requires more explanatio

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-06 Thread Peter Geoghegan
On Tue, Jun 6, 2017 at 5:01 PM, Peter Geoghegan wrote: > Also, ISTM that the code within ENRMetadataGetTupDesc() probably > requires more explanation, resource management wise. Also, it's not clear why it should be okay that the new type of ephemeral RTEs introduced don't have pe

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-07 Thread Peter Geoghegan
n't provide much in the way of guidance. My assumption about how transition tables ought to behave here is based on the simple fact that we already fire both AFTER statement-level triggers, plus my sense of aesthetics, or bias. I admit that I might be missing the point, but if I am it would

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-07 Thread Peter Geoghegan
On Wed, Jun 7, 2017 at 3:00 PM, Peter Geoghegan wrote: > My assumption would be that since you have as many as two > statement-level triggers firing that could reference transition tables > when ON CONFLICT DO UPDATE is used (one AFTER UPDATE statement level > trigger, and another

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-08 Thread Peter Geoghegan
at he'll be able to put in *sufficient* time, and in light of that concedes that it might be best to revert and revisit for Postgres 11. He is being cautious, and does not want to *risk* unduly holding up the release. That was my understanding, at least. -- Peter Geoghegan

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-08 Thread Peter Geoghegan
to discourage revert on the grounds that it's a slippery slope. Admitting fault doesn't need to be made any harder. -- Peter Geoghegan

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-08 Thread Peter Geoghegan
studying > this thread and the patches and determining whether or not I'm willing > to take responsibility for this patch. Thank you. -- Peter Geoghegan

Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-06-08 Thread Peter Geoghegan
CONFLICT DO NOTHING/UPDATE to COPY seems > to be a large separated task and is out of the current project scope, but > maybe there is > a relatively simple way to somehow perform internally tuples insert with > ON CONFLICT DO NOTHING? I have added Peter Geoghegan to cc, as > I unders

Re: [HACKERS] strcmp() tie-breaker for identical ICU-collated strings

2017-06-09 Thread Peter Geoghegan
me users might even find it worth > giving up hashing in order to get the exact sort order they need. But they are getting the sort order they need. They just don't get the equality semantics they expect. -- Peter Geoghegan

Re: [HACKERS] strcmp() tie-breaker for identical ICU-collated strings

2017-06-09 Thread Peter Geoghegan
On Fri, Jun 9, 2017 at 10:45 AM, Robert Haas wrote: >> But they are getting the sort order they need. They just don't get the >> equality semantics they expect. > > You're right. If we happened to ever guarantee the user a stable sort, then I'd be wrong. We don

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-09 Thread Peter Geoghegan
On Thu, Jun 8, 2017 at 3:13 PM, Robert Haas wrote: > On Tue, Jun 6, 2017 at 8:19 PM, Peter Geoghegan wrote: >> On Tue, Jun 6, 2017 at 5:01 PM, Peter Geoghegan wrote: >>> Also, ISTM that the code within ENRMetadataGetTupDesc() probably >>> requires more explanatio

Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!

2017-06-11 Thread Peter Geoghegan
ently selectivity estimation isn't particularly challenging with the TPC-H queries. I think that the big challenge for us is limitations like this; there are similar issues with a number of other TPC-H queries. It would be great if someone looked into implementing bitmap semi-join. -- Peter Geoghe

Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!

2017-06-11 Thread Peter Geoghegan
ed this thread. Clearly Q20 is designed to reward systems that do better with moving predicates into subqueries, as opposed to systems with better selectivity estimation. -- Peter Geoghegan

Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!

2017-06-11 Thread Peter Geoghegan
On Sun, Jun 11, 2017 at 10:27 AM, Peter Geoghegan wrote: > Note that I introduced a new, redundant exists() in the agg_lineitem > fact table subquery. It now takes 23 seconds for me on Tomas' 10GB > TPC-H dataset, whereas the original query took over 90 minutes. > Clearly we'

Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!

2017-06-11 Thread Peter Geoghegan
Q20 causing > choice of inefficient plan), it's a great paper to read. I thought I've > already posted a link to this paper sometime in the past, but I don't > see it in the archives. Thanks for the tip! The practical focus of this paper really appeals to me. -- Peter

Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-06-12 Thread Peter Geoghegan
HING feature). I haven't thought about this very carefully, but I guess you could do something like passing a flag to ExecConstraints() that indicates "don't throw an error; instead, just return false so I know not to proceed". Plus maybe one or two other cases, like using specula

Re: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-21 Thread Peter Geoghegan
ometimes called natural sorting. See https://en.wikipedia.org/wiki/Natural_sort_order. Thanks -- Peter Geoghegan

Re: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-21 Thread Peter Geoghegan
On Mon, Aug 21, 2017 at 9:33 AM, Peter Geoghegan wrote: > On Mon, Aug 21, 2017 at 8:23 AM, Peter Eisentraut > wrote: >> Here are my patches to address this. > > These look good. Also, I don't know why en-u-kr-others-digit wasn't accepted by CREATE COLLATION, as you s

Re: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-21 Thread Peter Geoghegan
On Mon, Aug 21, 2017 at 4:48 PM, Peter Eisentraut wrote: > On 8/21/17 12:33, Peter Geoghegan wrote: >> On Mon, Aug 21, 2017 at 8:23 AM, Peter Eisentraut >> wrote: >>> Here are my patches to address this. >> >> These look good. > > Committed. That closes

Re: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-25 Thread Peter Geoghegan
erbian (sr-*), regardless of what country code may also appear, even if the country code is not just obsolete, but entirely bogus. Events like the dissolution of countries are rare enough that that extra assurance is just a nice-to-have, though. [1] https://en.wikipedia.org/wiki/ISO_3166-2:CS -- Pet

Re: [HACKERS] pgbench: faster version of tpcb-like transaction

2017-08-26 Thread Peter Geoghegan
"-M prepared"? I think that most of us use that setting already, especially with CPU-bound workloads. -- Peter Geoghegan

Re: [HACKERS] pgbench: faster version of tpcb-like transaction

2017-08-26 Thread Peter Geoghegan
must admit that I had a similar unpleasant surprise at one point -- "-M prepared" seems to matter *a lot* these days. That's the default that I'd change, if any. -- Peter Geoghegan

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Peter Geoghegan
On Thu, May 11, 2017 at 4:30 PM, Peter Geoghegan wrote: > I spent only a few hours writing a rough prototype, and came up with > something that does an IndexBuildHeapScan() scan following the > existing index verification steps. Its amcheck callback does an > index_form_tuple() call

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Peter Geoghegan
p_bits_set(bloom_filter *filter) > +{ > + int bitset_bytes = NBITS(filter) / BITS_PER_BYTE; > + int64 bits_set = 0; > + int i; > + > + for (i = 0; i < bitset_bytes; i++) > + { > + unsigned char byte = filter->bitset[i]; > + > +

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Peter Geoghegan
o!). pop()/popcount() does seem like a clever algorithm, that we should probably think about adopting in some cases, but I should point out that the current caller to my bloom_prop_bits_set() function is an elog() DEBUG1 call. This is not at all performance critical. > + * Test if bloom filte
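For readers who have not seen a branch-free population count, here is a minimal Python sketch next to a bit-at-a-time loop. The snippet above does not say which popcount variant was proposed, so the well-known 32-bit SWAR ("parallel") bit count is shown as an assumption; popcount32, bits_set_naive and prop_bits_set are hypothetical names, not the functions from the patch.

```python
def popcount32(x: int) -> int:
    """Count set bits in a 32-bit word using the classic SWAR technique."""
    x &= 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)                 # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)  # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                 # 8-bit partial sums
    # The multiply accumulates the four byte sums into the top byte.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24

def bits_set_naive(bitset: bytes) -> int:
    """Count set bits one bit at a time, byte by byte."""
    count = 0
    for byte in bitset:
        while byte:
            count += byte & 1
            byte >>= 1
    return count

def prop_bits_set(bitset: bytes) -> float:
    """Proportion of bits set in a non-empty bitset (assumes len > 0)."""
    return bits_set_naive(bitset) / (len(bitset) * 8)
```

As the message notes, when the only caller is a debug-level log line, the simple loop is a reasonable choice and the faster variant buys little.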

Re: [HACKERS] A design for amcheck heapam verification

2017-08-30 Thread Peter Geoghegan
's still not really noticeable. -- Peter Geoghegan

Re: [HACKERS] Polyphase merge is obsolete

2017-08-30 Thread Peter Geoghegan
On Mon, Feb 27, 2017 at 2:45 PM, Peter Geoghegan wrote: > Since we have an awful lot of stuff in the last CF, and this patch > doesn't seem particularly strategic, I've marked it "Returned with > Feedback". I noticed that this is in the upcoming CF 1 for v11. I

Re: [HACKERS] The case for removing replacement selection sort

2017-08-30 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 12:51 PM, Robert Haas wrote: > On Fri, Jul 14, 2017 at 6:20 PM, Peter Geoghegan wrote: >> With the additional enhancements made to Postgres 10, I doubt that >> there are any remaining cases where it wins. > > The thing to do about that would be to co

Re: [HACKERS] Polyphase merge is obsolete

2017-08-30 Thread Peter Geoghegan
in this area, of which this is only the latest. I'm saying "hey, have you thought about RS too?". Whether or not I'm "hijacking" this thread is, at best, ambiguous. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-08-30 Thread Peter Geoghegan
he fact that we need about 10 merge passes without replacement selection, and only have enough memory for 7 tapes. I think that I could find a case that makes replacement selection look much worse, if I tried. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-08-30 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 3:14 PM, Peter Geoghegan wrote: > This is significantly faster, in a way that's clearly reproducible and > consistent, despite the fact that we need about 10 merge passes > without replacement selection, and only have enough memory for 7 > tapes. I think

Re: [HACKERS] The case for removing replacement selection sort

2017-08-30 Thread Peter Geoghegan
rking, showing some benefit with RS with a work_mem of 8MB or less. As I said in my introduction on this thread, we weren't wrong to add replacement_sort_tuples to 9.6, given where things were with merging at the time. But, it does very much appear to create less than zero benefit these
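For readers unfamiliar with the technique being debated, here is a minimal Python sketch of textbook replacement selection run generation. It is not the tuplesort.c implementation; replacement_selection_runs and memory_slots are hypothetical names, and memory_slots (assumed to be at least 1) merely stands in for how many tuples fit in memory.

```python
import heapq

_END = object()  # marks exhausted input

def replacement_selection_runs(items, memory_slots):
    """Split items into sorted runs using a heap of memory_slots entries.

    Heap entries are (run_number, value), so values destined for a later
    run sort after everything that still belongs to the current run.
    """
    it = iter(items)
    heap = []
    for value in it:
        heap.append((0, value))
        if len(heap) == memory_slots:
            break
    heapq.heapify(heap)

    runs = []
    while heap:
        run_no, value = heapq.heappop(heap)
        if run_no == len(runs):
            runs.append([])  # first value of a new run
        runs[run_no].append(value)
        nxt = next(it, _END)
        if nxt is not _END:
            # A value smaller than the one just written out cannot be
            # placed in the current run without breaking its order.
            target = run_no if nxt >= value else run_no + 1
            heapq.heappush(heap, (target, nxt))
    return runs

presorted = replacement_selection_runs(range(10), 3)
reversed_input = replacement_selection_runs(range(9, -1, -1), 3)
```

Presorted input yields a single run regardless of memory size, which is the classic advantage of replacement selection; reverse-ordered input yields runs no longer than memory_slots, so nothing is gained there over sorting each memory-sized batch.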

Re: [HACKERS] The case for removing replacement selection sort

2017-08-31 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 4:59 PM, Peter Geoghegan wrote: > I may submit the simple patch to remove replacement selection, if > other contributors are receptive. Apart from everything else, the > "incrementalism" of replacement selection works against cleverer batch > memory

Re: [HACKERS] INSERT .. ON CONFLICT DO SELECT [FOR ..]

2017-09-03 Thread Peter Geoghegan
On Tue, Aug 15, 2017 at 12:17 AM, Marko Tiikkaja wrote: > On Tue, Aug 15, 2017 at 7:43 AM, Peter Geoghegan wrote: >> >> On Mon, Aug 14, 2017 at 6:23 PM, Marko Tiikkaja wrote: >> > Attached is a patch for $SUBJECT. It might still be a bit rough around >> > the

Re: [HACKERS] INSERT .. ON CONFLICT DO SELECT [FOR ..]

2017-09-04 Thread Peter Geoghegan
r? I actually think that ON CONFLICT DO NOTHING does have semantics that are, shall we say, questionable. That's the cost of having it not lock conflicting rows during big ETL operations. That's a huge practical benefit for ETL use-cases. Whereas here, with ON CONFLICT DO SELECT, I see

Re: [HACKERS] The case for removing replacement selection sort

2017-09-06 Thread Peter Geoghegan
"> but you removed that id. Thanks for looking into it. I suppose that the solution is to change the 9.6 release notes to no longer have a replacement_sort_tuples link. Anyone else have an opinion on that? -- Peter Geoghegan

Re: [HACKERS] A design for amcheck heapam verification

2017-09-06 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 9:29 AM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera > wrote: >> Eh, if you want to optimize it for the case where debug output is not >> enabled, make sure to use ereport() not elog(). ereport() >> short-circuits

Re: [HACKERS] The case for removing replacement selection sort

2017-09-08 Thread Peter Geoghegan
rt_tuples just isn't that many tuples (150,000). * The upside of my patch is not inconsiderable: We can remove a lot of code, which will enable further improvements in the future, which are far more compelling (cleaner memory management, the use of memory batches during run generation). -- P

Re: [HACKERS] The case for removing replacement selection sort

2017-09-10 Thread Peter Geoghegan
egressions when upgrading from versions prior to 9.6 (9.6 is the version where we began to generally prefer quicksort). > Also, people often sort on > keys of more than one column Very true. I should test this. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-10 Thread Peter Geoghegan
least for the data sets with 100k and 1M rows. The tests with 10M > rows will take much more time (it takes 1-2 hours for a single work_mem > value, and we're testing 6 of them). I myself don't see that much value in a 10M row test. Thanks for volunteering to test! -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-10 Thread Peter Geoghegan
, it will not be due to the number of distinct tuples. But, if the extra time it takes doesn't matter to you, then it doesn't matter to me either. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-11 Thread Peter Geoghegan
performance improvement is just a bonus IMV. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-11 Thread Peter Geoghegan
On Mon, Sep 11, 2017 at 8:32 AM, Robert Haas wrote: > On Sun, Sep 10, 2017 at 9:39 PM, Peter Geoghegan wrote: >> To be clear, you'll still need to set replacement_sort_tuples high >> when testing RS, to make sure that we really use it for at least the >> first run when

Re: [HACKERS] The case for removing replacement selection sort

2017-09-11 Thread Peter Geoghegan
urt than to help, because the enhancements to merging that went into Postgres 10 reduced the downside of not using replacement selection. And so, for Postgres 11 replacement_sort_tuples deserves to be removed. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-11 Thread Peter Geoghegan
within commit 24598337c, and memory utilization for merging got better, too. Very roughly speaking, merging attained the same advantage that replacement selection had all along, and replacement selection lost all ability to compete. -- Peter Geoghegan

Re: [HACKERS] CLUSTER command progress monitor

2017-09-11 Thread Peter Geoghegan
component isn't where the real costs are. Profiling shows that writing out the new heap (including moderately complicated bookkeeping) is the bottleneck, IIRC. That's why parallel CLUSTER didn't look attractive, even though it would be a fairly straightforward matter to add that on top

Re: [HACKERS] The case for removing replacement selection sort

2017-09-11 Thread Peter Geoghegan
On Wed, Sep 6, 2017 at 2:55 PM, Peter Geoghegan wrote: > On Wed, Sep 6, 2017 at 2:47 PM, Thomas Munro > wrote: >>> I attach a patch to remove replacement selection, which I'll submit to CF 1. >> >> This breaks the documentation build, because >> doc/sr

Re: [HACKERS] Automatic testing of patches in commit fest

2017-09-12 Thread Peter Geoghegan
ing" techniques, which are obviously fairly brittle. Similarly, Thomas' patch testing web application should itself have a web API, potentially usable by the CF app. -- Peter Geoghegan

Re: [HACKERS] Trouble with amcheck

2017-09-14 Thread Peter Geoghegan
I'm stumped as to > what. I think you need to build and install contrib, so that it is available to the server that you're running an installcheck against. amcheck is alphabetically first among contrib modules that have tests, IIRC. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-15 Thread Peter Geoghegan
/O costs. This is the opposite benefit to the one you'd expect from reading Knuth. Thanks for benchmarking! I hope that this removes the doubt that replacement selection previously benefited from; it now clearly deserves to be removed from tuplesort.c. -- Peter Geoghegan

Re: [HACKERS] The case for removing replacement selection sort

2017-09-15 Thread Peter Geoghegan
t-bench-e5-5450 > > At this point the 10M row tests are running, but I don't expect anything > particularly surprising from those results. That is, it's not something > that should block getting this patch committed, if the agreement is to > commit otherwise. >

Re: [HACKERS] A design for amcheck heapam verification

2017-09-16 Thread Peter Geoghegan
On Wed, Sep 6, 2017 at 7:26 PM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 9:29 AM, Peter Geoghegan wrote: >> On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera >> wrote: >>> Eh, if you want to optimize it for the case where debug output is not >>> enable

Re: [HACKERS] valgrind vs. shared typmod registry

2017-09-16 Thread Peter Geoghegan
d been failing within select_parallel only following the commit that you mentioned: https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=skink&br=HEAD -- Peter Geoghegan

[HACKERS] ICU locales and text/char(n) SortSupport on Windows

2017-09-16 Thread Peter Geoghegan
ways accepted as legacy that ICU had to live with. Maybe a static assertion is all that we need here (ICU builds must also be USE_WIDE_UPPER_LOWER builds). -- Peter Geoghegan From fca07aca51fb7979f19180712707233ff0e6f4b4 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Sat, 16 Sep 2017 13:36:

Re: [HACKERS] Boom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Peter Geoghegan
low cost, then maybe it makes sense to apply bloom filters as an opportunistic or optimistic optimization. Perhaps they can be applied when there is little to lose but much to gain. [1] http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf -- Peter Geoghegan

Re: [HACKERS] Boom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Peter Geoghegan
is is all pretty speculative. I suspect that this could be true, and it seems worth investigating that framing of the problem first. -- Peter Geoghegan

[HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-18 Thread Peter Geoghegan
ard. [1] postgr.es/m/cah2-wzm22vtxvd-e1oz90de8z_m61_8amhsdozf1pwrkfrm...@mail.gmail.com [2] https://ssl.icu-project.org/apiref/icu4c/uloc_8h.html#a1d50c91925ca3853fce6f28cf7390c3c -- Peter Geoghegan

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-09-18 Thread Peter Geoghegan
, and has > received plenty feedback this CF. I'll therefore move this to the next > commitfest. Does anyone have ideas on a way forward here? I don't, but then I haven't thought about it in detail in several months. -- Peter Geoghegan

Re: [HACKERS] Boom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-19 Thread Peter Geoghegan
om the outer relation has a match in the hash table). I believe that parallelism makes the use of Bloom filters a lot more compelling, too. Obviously this is something that wasn't taken into consideration in 2015. -- Peter Geoghegan
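
For readers following the Bloom filter thread, here is a rough illustration of the general idea mentioned in the snippet above: a Bloom filter built during the hash join's build phase can cheaply rule out outer tuples that cannot have a match, so the hash table probe is skipped for them. This is only a sketch under stated assumptions, not PostgreSQL's executor code and not anything proposed in the thread. The names BloomFilter, might_contain and hash_join, the filter size, and the hashing scheme are all hypothetical choices made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, some false positives."""

    def __init__(self, nbits=1024, nhashes=3):
        # nbits and nhashes are arbitrary illustrative defaults.
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray((nbits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.nbits for i in range(self.nhashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def hash_join(inner, outer, key=lambda row: row[0]):
    """Join two lists of tuples on key; returns (matches, probes_skipped)."""
    table = {}
    bloom = BloomFilter()
    for row in inner:                      # build phase
        table.setdefault(key(row), []).append(row)
        bloom.add(key(row))

    matches, skipped = [], 0
    for row in outer:                      # probe phase
        k = key(row)
        if not bloom.might_contain(k):
            skipped += 1                   # definitely no match: skip the probe
            continue
        for inner_row in table.get(k, []):
            matches.append((inner_row, row))
    return matches, skipped
```

In this toy version the filter saves only a dictionary lookup, so it shows the mechanism rather than any real benefit. The gain discussed in the thread would presumably come from cases where probing is expensive relative to checking the filter.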

Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-19 Thread Peter Geoghegan
on in place then it wouldn't > be an issue ;-) I was proposing that this be treated as an open item for v10; sorry if I was unclear on that. Much like the "ICU locales vs. ICU collations within pg_collation" issue, this seems like the kind of thing that we ought to go out of our w
