Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-23 Thread Robert Haas
On Wed, Jan 21, 2015 at 2:22 AM, Peter Geoghegan wrote: > You'll probably prefer the attached. This patch works by disabling > abbreviation, but only after writing out runs, with the final merge > left to go. That way, it doesn't matter when abbreviated keys are not > read back from disk (or regen

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-23 Thread Robert Haas
On Fri, Jan 23, 2015 at 2:18 AM, David Rowley wrote: > On 20 January 2015 at 17:10, Peter Geoghegan wrote: >> >> On Mon, Jan 19, 2015 at 7:47 PM, Michael Paquier >> wrote: >> >> > With your patch applied, the failure with MSVC disappeared, but there >> > is still a warning showing up: >> > (ClCo

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-22 Thread David Rowley
On 20 January 2015 at 17:10, Peter Geoghegan wrote: > On Mon, Jan 19, 2015 at 7:47 PM, Michael Paquier > wrote: > > > With your patch applied, the failure with MSVC disappeared, but there > > is still a warning showing up: > > (ClCompile target) -> > > src\backend\lib\hyperloglog.c(73): warnin

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-21 Thread Andrew Gierth
> "Peter" == Peter Geoghegan writes: Peter> Okay, then. I concede the point: We should support the datum Peter> case as you outline, since it is simpler than any Peter> alternative. It probably won't even be necessary to formalize Peter> the idea that finished abbreviated keys must be pas

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-21 Thread Peter Geoghegan
On Wed, Jan 21, 2015 at 2:11 PM, Peter Geoghegan wrote: > Okay, then. I concede the point: We should support the datum case as > you outline, since it is simpler than any alternative. It probably > won't even be necessary to formalize the idea that finished > abbreviated keys must be pass-by-value

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-21 Thread Peter Geoghegan
On Wed, Jan 21, 2015 at 4:44 AM, Andrew Gierth wrote: > Now, I follow this general principle that someone who is not doing the > work should never say "X is easy" to someone who _is_ doing it, unless > they're prepared to at least outline the solution on request or > otherwise contribute. So see

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-21 Thread Andrew Gierth
> "Peter" == Peter Geoghegan writes: Peter> Basically, the intersection of the datum sort case with Peter> abbreviated keys seems complicated. Not to me. To me it seems completely trivial. Now, I follow this general principle that someone who is not doing the work should never say "X is e

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-21 Thread Andrew Gierth
> "Peter" == Peter Geoghegan writes: Peter> You'll probably prefer the attached. This patch works by Peter> disabling abbreviation, but only after writing out runs, with Peter> the final merge left to go. That way, it doesn't matter when Peter> abbreviated keys are not read back from disk

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 6:39 PM, Peter Geoghegan wrote: > On Tue, Jan 20, 2015 at 6:34 PM, Robert Haas wrote: >> That might be OK. Probably needs a bit of performance testing to see >> how it looks. > > Well, we're still only doing it when we do our final merge. So that's > "only" doubling the n

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 6:34 PM, Robert Haas wrote: > That might be OK. Probably needs a bit of performance testing to see > how it looks. Well, we're still only doing it when we do our final merge. So that's "only" doubling the number of conversions required, which if we're blocked on I/O might

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Tue, Jan 20, 2015 at 9:33 PM, Peter Geoghegan wrote: > On Tue, Jan 20, 2015 at 6:30 PM, Robert Haas wrote: >> I don't want to change the on-disk format for tapes without a lot more >> discussion. Can you come up with a fix that avoids that for now? > > A more conservative approach would be to

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 6:30 PM, Robert Haas wrote: > I don't want to change the on-disk format for tapes without a lot more > discussion. Can you come up with a fix that avoids that for now? A more conservative approach would be to perform conversion on-the-fly once more. That wouldn't be paten

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Tue, Jan 20, 2015 at 8:39 PM, Peter Geoghegan wrote: > On Tue, Jan 20, 2015 at 5:32 PM, Robert Haas wrote: >> I was assuming we were going to fix this by undoing the abbreviation >> (as in the abort case) when we spill to disk, and not bothering with >> it thereafter. > > The spill-to-disk cas

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 5:46 PM, Peter Geoghegan wrote: > Would you prefer it if the spill-to-disk case > aborted in the style of low entropy keys? That doesn't seem > significantly safer than this, and it's certainly not acceptable from a > performance perspective. BTW, I can write that patch if t

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 5:42 PM, Robert Haas wrote: > On Tue, Jan 20, 2015 at 8:39 PM, Peter Geoghegan wrote: >> On Tue, Jan 20, 2015 at 5:32 PM, Robert Haas wrote: >>> I was assuming we were going to fix this by undoing the abbreviation >>> (as in the abort case) when we spill to disk, and not

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Tue, Jan 20, 2015 at 8:39 PM, Peter Geoghegan wrote: > On Tue, Jan 20, 2015 at 5:32 PM, Robert Haas wrote: >> I was assuming we were going to fix this by undoing the abbreviation >> (as in the abort case) when we spill to disk, and not bothering with >> it thereafter. > > The spill-to-disk cas

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 5:32 PM, Robert Haas wrote: > I was assuming we were going to fix this by undoing the abbreviation > (as in the abort case) when we spill to disk, and not bothering with > it thereafter. The spill-to-disk case is at least as compelling as the internal sort case. The overhe

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Tue, Jan 20, 2015 at 7:07 PM, Peter Geoghegan wrote: > On Tue, Jan 20, 2015 at 3:57 PM, Peter Geoghegan wrote: >> It's certainly possible to fix Andrew's test case with the attached. >> I'm not sure that that's the appropriate fix, though: there is >> probably a case to be made for not botheri

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 3:57 PM, Peter Geoghegan wrote: > It's certainly possible to fix Andrew's test case with the attached. > I'm not sure that that's the appropriate fix, though: there is > probably a case to be made for not bothering with abbreviation once > we've read tuples in for the final

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 3:34 PM, Peter Geoghegan wrote: > On Tue, Jan 20, 2015 at 3:34 PM, Robert Haas wrote: >> Dear me. Peter, can you fix this RSN? > > Investigating. It's certainly possible to fix Andrew's test case with the attached. I'm not sure that that's the appropriate fix, though: th

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 3:34 PM, Robert Haas wrote: > Dear me. Peter, can you fix this RSN? Investigating. -- Peter Geoghegan

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 3:33 PM, Robert Haas wrote: > Peter, this made bowerbird (Windows 8/Visual Studio) build, but it's > failing make check. Ditto hamerkop (Windows 2k8/VC++) and currawong > (Windows XP Pro/MSVC++). jacana (Windows 8/gcc) and brolga (Windows > XP Pro/cygwin) are unhappy too,

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Tue, Jan 20, 2015 at 6:27 PM, Andrew Gierth wrote: >> "Robert" == Robert Haas writes: > Robert> All right, it seems Tom is with you on that point, so after > Robert> some study, I've committed this with very minor modifications. > > While hacking up a patch to demonstrate the simplicity

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Tue, Jan 20, 2015 at 10:54 AM, Robert Haas wrote: > On Mon, Jan 19, 2015 at 9:29 PM, Peter Geoghegan wrote: >> I think that the attached patch should at least fix that much. Maybe >> the problem on the other animal is also explained by the lack of this, >> since there could also be a MinGW-ish

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Andrew Gierth
> "Robert" == Robert Haas writes: Robert> All right, it seems Tom is with you on that point, so after Robert> some study, I've committed this with very minor modifications. While hacking up a patch to demonstrate the simplicity of extending this to the Datum sorter, I seem to have run into

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 2:00 PM, Peter Geoghegan wrote: > Maybe that's the > wrong way of fixing that, but for now I don't think it's acceptable > that abbreviation isn't always used in certain cases where it could > make sense (e.g. not for simple GroupAggregates with a single > attribute -- only

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Peter Geoghegan
On Tue, Jan 20, 2015 at 3:46 AM, Andrew Gierth wrote: > The comment in tuplesort_begin_datum that abbreviation can't be used > seems wrong to me; why is the copy of the original value pointed to by > stup->tuple (in the case of by-reference types, and abbreviation is > obviously not needed for by-

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Robert Haas
On Mon, Jan 19, 2015 at 9:29 PM, Peter Geoghegan wrote: > I think that the attached patch should at least fix that much. Maybe > the problem on the other animal is also explained by the lack of this, > since there could also be a MinGW-ish strxfrm_l(), I suppose. Committed that, rather blindly, s

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-20 Thread Andrew Gierth
> "Robert" == Robert Haas writes: Robert> All right, it seems Tom is with you on that point, so after Robert> some study, I've committed this with very minor modifications. This caught my eye (thanks to conflict with GS patch): * In the future, we should consider forcing the * tuplesort

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Peter Geoghegan
On Mon, Jan 19, 2015 at 7:47 PM, Michael Paquier wrote: > On MinGW-32, not that I know of: > $ find . -name *.h | xgrep strxfrm_l > ./lib/gcc/mingw32/4.8.1/include/c++/mingw32/bits/c++config.h:/* Define if > strxfrm_l is available in . */ > ./mingw32/lib/gcc/mingw32/4.8.1/include/c++/mingw32/b

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Michael Paquier
On Tue, Jan 20, 2015 at 11:29 AM, Peter Geoghegan wrote: > On Mon, Jan 19, 2015 at 5:59 PM, Peter Geoghegan wrote: >> On Mon, Jan 19, 2015 at 5:33 PM, Alvaro Herrera >> wrote: >>> You did notice that bowerbird isn't building, right? >>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowe

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Peter Geoghegan
On Mon, Jan 19, 2015 at 5:59 PM, Peter Geoghegan wrote: > On Mon, Jan 19, 2015 at 5:33 PM, Alvaro Herrera > wrote: >> You did notice that bowerbird isn't building, right? >> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2015-01-19%2023%3A54%3A46 > > Yeah. Looks like strxfrm_

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Peter Geoghegan
On Mon, Jan 19, 2015 at 5:33 PM, Alvaro Herrera wrote: > You did notice that bowerbird isn't building, right? > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2015-01-19%2023%3A54%3A46 Yeah. Looks like strxfrm_l() isn't available on the animal, for whatever reason. -- Peter

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Alvaro Herrera
Peter Geoghegan wrote: > It appears that the buildfarm animal brolga isn't happy about this > patch. I'm not sure why, since I thought we already figured out bugs > or other inconsistencies in various strxfrm() implementations. You did notice that bowerbird isn't building, right? http://buildfarm

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Robert Haas
On Mon, Jan 19, 2015 at 5:43 PM, Peter Geoghegan wrote: > It appears that the buildfarm animal brolga isn't happy about this > patch. I'm not sure why, since I thought we already figured out bugs > or other inconsistencies in various strxfrm() implementations. Well, the first thing that comes to

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Peter Geoghegan
On Mon, Jan 19, 2015 at 12:33 PM, Robert Haas wrote: > All right, it seems Tom is with you on that point, so after some > study, I've committed this with very minor modifications. Sorry for > the long delay. Thank you very much for your help with this! I appreciate it. > I have not committed th

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: > On the PPC64 machine I normally use for performance testing, it takes > about 6.3 seconds to build the index with the commit just before this > one. With this commit, it drops to 1.9 seconds. That's more than a > 3x speedup! > > Now, if I change the

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Robert Haas
On Mon, Jan 19, 2015 at 3:33 PM, Robert Haas wrote: > All right, it seems Tom is with you on that point, so after some > study, I've committed this with very minor modifications. Sorry for > the long delay. I have not committed the 0002 patch, though, because > I haven't studied that enough yet

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2015-01-19 Thread Robert Haas
On Tue, Dec 2, 2014 at 8:28 PM, Peter Geoghegan wrote: > On Tue, Dec 2, 2014 at 2:16 PM, Peter Geoghegan wrote: >> On Tue, Dec 2, 2014 at 2:07 PM, Robert Haas wrote: >>> Well, maybe you should make the updates we've agreed on and I can take >>> another look at it. >> >> Agreed. > > Attached, rev

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-14 Thread Michael Paquier
On Wed, Dec 3, 2014 at 10:43 AM, Peter Geoghegan wrote: > On Tue, Dec 2, 2014 at 5:28 PM, Peter Geoghegan wrote: >> Attached, revised patchset makes these updates. > > Whoops. Missed some obsolete comments. Here is a third commit that > makes a further small modification to one comment. Moving th

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-09 Thread Peter Geoghegan
There is an interesting thread about strcoll() overhead over on -general: http://www.postgresql.org/message-id/cab25xexnondrmc1_cy3jvmb0tmydm38ef9q2d7xla0rbncj...@mail.gmail.com My guess was that this person experienced a rather unexpected downside of spilling to disk when sorting on a text attri

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-03 Thread Peter Geoghegan
On Tue, Dec 2, 2014 at 1:21 PM, Peter Geoghegan wrote: > Incidentally, I think that an under-appreciated possible source of > regressions here is that attributes abbreviated have a strong > physical/logical correlation. I could see a small regression for one > such case even though my cost model i

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-03 Thread Robert Haas
On Tue, Dec 2, 2014 at 5:44 PM, Tom Lane wrote: > Peter Geoghegan writes: >> On Tue, Dec 2, 2014 at 2:21 PM, Robert Haas wrote: >>> Right, and what I'm saying is that maybe the "applicability" flag >>> shouldn't be stored in the SortSupport object, but passed down as an >>> argument. > >> But th

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Peter Geoghegan
On Tue, Dec 2, 2014 at 5:28 PM, Peter Geoghegan wrote: > Attached, revised patchset makes these updates. Whoops. Missed some obsolete comments. Here is a third commit that makes a further small modification to one comment. -- Peter Geoghegan From 8d1aba80f95e05742047cba5bd83d8f17aa5ef37 Mon Sep

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Peter Geoghegan
On Tue, Dec 2, 2014 at 2:16 PM, Peter Geoghegan wrote: > On Tue, Dec 2, 2014 at 2:07 PM, Robert Haas wrote: >> Well, maybe you should make the updates we've agreed on and I can take >> another look at it. > > Agreed. Attached, revised patchset makes these updates. I continue to use the sortsuppo

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Tom Lane
Peter Geoghegan writes: > On Tue, Dec 2, 2014 at 2:21 PM, Robert Haas wrote: >> Right, and what I'm saying is that maybe the "applicability" flag >> shouldn't be stored in the SortSupport object, but passed down as an >> argument. > But then how does that information get to any given sortsupport

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Peter Geoghegan
On Tue, Dec 2, 2014 at 2:21 PM, Robert Haas wrote: > Right, and what I'm saying is that maybe the "applicability" flag > shouldn't be stored in the SortSupport object, but passed down as an > argument. But then how does that information get to any given sortsupport routine? That's the place that

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Robert Haas
On Tue, Dec 2, 2014 at 5:16 PM, Peter Geoghegan wrote: > On Tue, Dec 2, 2014 at 2:07 PM, Robert Haas wrote: >> Well, maybe you should make the updates we've agreed on and I can take >> another look at it. > > Agreed. > >> But I didn't think that I was proposing to change >> anything about the lev

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Peter Geoghegan
On Tue, Dec 2, 2014 at 2:07 PM, Robert Haas wrote: > Well, maybe you should make the updates we've agreed on and I can take > another look at it. Agreed. > But I didn't think that I was proposing to change > anything about the level at which the decision about whether to > abbreviate or not was

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Robert Haas
On Tue, Dec 2, 2014 at 4:21 PM, Peter Geoghegan wrote: >>> I'm not sure about that. I'd prefer to have tuplesort (and one or two >>> other sites) set the "abbreviation is possible in principle" flag. >>> Otherwise, sortsupport needs to assume that the leading attribute is >>> going to be the abbre

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Peter Geoghegan
On Tue, Dec 2, 2014 at 1:00 PM, Robert Haas wrote: > I'd prefer not to have a #define in pg_config_manual.h. Only stuff > that we expect a reasonably decent number of users to need to change > should be in that file, and this is too marginal for that. If anybody > other than the developers of th

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-02 Thread Robert Haas
On Tue, Nov 25, 2014 at 1:38 PM, Peter Geoghegan wrote: > On Tue, Nov 25, 2014 at 4:01 AM, Robert Haas wrote: >> - This appears to needlessly reindent the comments for PG_CACHE_LINE_SIZE. > > Actually, the word "only" is removed (because PG_CACHE_LINE_SIZE has a > new client now). So it isn't qui

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-11-30 Thread Peter Geoghegan
On Tue, Nov 25, 2014 at 4:01 AM, Robert Haas wrote: > There's a lot of stuff in this patch I'm still trying to digest I spotted a bug in the most recent revision. Mea culpa. I think that the new field Tuplesortstate.abbrevNext should be an int64, not an int. The fact that Tuplesortstate.memtupco

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-11-25 Thread Peter Geoghegan
On Tue, Nov 25, 2014 at 10:38 AM, Peter Geoghegan wrote: >> - Also, I don't think making abbrev_state an enumerated value with two >> values is really doing anything for us; we could just use a Boolean. >> I'm wondering if we should actually go a bit further and remove this >> from the SortSupport

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-11-25 Thread Peter Geoghegan
On Tue, Nov 25, 2014 at 4:01 AM, Robert Haas wrote: > - This appears to needlessly reindent the comments for PG_CACHE_LINE_SIZE. Actually, the word "only" is removed (because PG_CACHE_LINE_SIZE has a new client now). So it isn't quite the same paragraph as before. > - I really don't think we nee

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-11-25 Thread Robert Haas
On Sun, Nov 9, 2014 at 10:02 PM, Peter Geoghegan wrote: > On Sat, Oct 11, 2014 at 6:34 PM, Peter Geoghegan wrote: >> Attached patch, when applied, accelerates all tuplesort cases using >> abbreviated keys, building on previous work here, as well as the patch >> posted to that other thread. > > I

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-11-09 Thread Peter Geoghegan
On Sat, Oct 11, 2014 at 6:34 PM, Peter Geoghegan wrote: > Attached patch, when applied, accelerates all tuplesort cases using > abbreviated keys, building on previous work here, as well as the patch > posted to that other thread. I attach an updated patch set, rebased on top of the master branch'
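
For readers new to the thread, the general idea behind "abbreviated keys" can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: the function names abbreviate_key() and compare_abbreviated() are hypothetical, the 8-byte key width is an assumption, and a real sort would compute each key once per tuple rather than on every comparison. The sketch packs the leading bytes of the strxfrm() blob into an integer, compares the integers first, and falls back to strcoll() only when they tie.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: pack the first 8 bytes of the strxfrm() blob for
 * "s" into a big-endian integer, so that unsigned integer comparison
 * agrees with bytewise comparison of the blob prefixes.
 */
static uint64_t
abbreviate_key(const char *s)
{
    size_t      need;
    char       *blob;
    uint64_t    key = 0;
    size_t      i;

    assert(s != NULL);

    /* With a zero-length buffer, strxfrm() only reports the size needed. */
    need = strxfrm(NULL, s, 0);
    blob = malloc(need + 1);
    if (blob == NULL)
        abort();                /* keep the sketch simple */
    strxfrm(blob, s, need + 1);

    for (i = 0; i < 8; i++)
    {
        /* Blobs shorter than 8 bytes are padded with zero bytes. */
        unsigned char byte = (i < need) ? (unsigned char) blob[i] : 0;

        key = (key << 8) | byte;
    }
    free(blob);
    return key;
}

/*
 * Hypothetical comparator: cheap integer comparison first, with the
 * authoritative strcoll() comparison only when the abbreviations tie.
 */
static int
compare_abbreviated(const char *a, const char *b)
{
    uint64_t    ka = abbreviate_key(a);
    uint64_t    kb = abbreviate_key(b);

    if (ka != kb)
        return (ka < kb) ? -1 : 1;
    return strcoll(a, b);
}
```

Equal abbreviated keys prove nothing about the full strings, which is why the fallback comparison is still needed on a tie.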

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-10-11 Thread Peter Geoghegan
On Mon, Sep 29, 2014 at 10:34 PM, Peter Geoghegan wrote: > . You probably noticed that I posted an independently useful patch to make all tuplesort cases use sortsupport [1] - currently, both the B-Tree and CLUSTER cases do not use the sortsupport infrastructure more or less for no good reason. T

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-29 Thread Peter Geoghegan
On Thu, Sep 25, 2014 at 1:36 PM, Robert Haas wrote: > (concerns about a second sortsupport state) I think I may have underestimated the cost of not have sorttuple.datum1 with a pointer-to-text representation available in cases such as the one you describe. Attached revision introduces an alterna

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-25 Thread Robert Haas
On Thu, Sep 25, 2014 at 3:17 PM, Peter Geoghegan wrote: >> To find out how much that optimization buys, you >> should use tuples with many variable-length columns (say, 50) >> preceding the text column you're sorting on. I won't be surprised if >> that turns out to be expensive enough to be worth

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-25 Thread Peter Geoghegan
On Thu, Sep 25, 2014 at 11:53 AM, Robert Haas wrote: > I haven't looked at that part of the patch in detail yet, so... not > really. But I don't see why you'd ever need to restart heap tuple > copying. At most you'd need to re-extract datum1 from the tuples you > have already copied. Well, okay

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-25 Thread Robert Haas
On Thu, Sep 25, 2014 at 2:05 PM, Peter Geoghegan wrote: > On Thu, Sep 25, 2014 at 9:21 AM, Robert Haas wrote: >> The top issue on my agenda is figuring out a way to get rid of the >> extra SortSupport object. > > Really? I'm surprised. Clearly the need to restart heap tuple copying > from scratch

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-25 Thread Peter Geoghegan
On Thu, Sep 25, 2014 at 9:21 AM, Robert Haas wrote: > The top issue on my agenda is figuring out a way to get rid of the > extra SortSupport object. Really? I'm surprised. Clearly the need to restart heap tuple copying from scratch, in order to make the datum1 representation consistent, rather th

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-25 Thread Robert Haas
On Wed, Sep 24, 2014 at 7:04 PM, Peter Geoghegan wrote: > On Fri, Sep 19, 2014 at 2:54 PM, Peter Geoghegan wrote: >> Probably not - it appears to make very little difference to >> unoptimized pass-by-reference types whether or not datum1 can be used >> (see my simulation of Kevin's worst case, fo

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-24 Thread Peter Geoghegan
On Fri, Sep 19, 2014 at 2:54 PM, Peter Geoghegan wrote: > Probably not - it appears to make very little difference to > unoptimized pass-by-reference types whether or not datum1 can be used > (see my simulation of Kevin's worst case, for example [1]). Streaming > through a not inconsiderable propo

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-19 Thread Peter Geoghegan
On Fri, Sep 19, 2014 at 2:35 PM, Robert Haas wrote: > Also, shouldn't you go back and fix up > those abbreviated keys to point to datum1 again if you abort? Probably not - it appears to make very little difference to unoptimized pass-by-reference types whether or not datum1 can be used (see my si

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-19 Thread Robert Haas
On Thu, Sep 11, 2014 at 8:34 PM, Peter Geoghegan wrote: > On Tue, Sep 9, 2014 at 2:25 PM, Robert Haas wrote: >>> I like that I don't have to care about every combination, and can >>> treat abbreviation abortion as the special case with the extra step, >>> in line with how I think of the optimizat

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-19 Thread Peter Geoghegan
On Fri, Sep 19, 2014 at 9:59 AM, Robert Haas wrote: > OK, good point. So committed as-is, then, except that I rewrote the > comments, which I felt were excessively long for the amount of code. Thanks! I look forward to hearing your thoughts on the open issues with the patch as a whole. -- Pete

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-19 Thread Robert Haas
On Tue, Sep 16, 2014 at 4:55 PM, Peter Geoghegan wrote: > On Tue, Sep 16, 2014 at 1:45 PM, Robert Haas wrote: >> Even though our testing seems to indicate that the memcmp() is >> basically free, I think it would be good to make the effort to avoid >> doing memcmp() and then strcoll() and then str

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-16 Thread Peter Geoghegan
On Tue, Sep 16, 2014 at 1:45 PM, Robert Haas wrote: > Even though our testing seems to indicate that the memcmp() is > basically free, I think it would be good to make the effort to avoid > doing memcmp() and then strcoll() and then strncmp(). Seems like it > shouldn't be too hard. Really? The t

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-16 Thread Robert Haas
On Mon, Sep 15, 2014 at 7:21 PM, Peter Geoghegan wrote: > On Mon, Sep 15, 2014 at 11:25 AM, Peter Geoghegan wrote: >> OK, I'll draft a patch for that today, including similar alterations >> to varstr_cmp() for the benefit of Windows and so on. > > I attach a much simpler patch, that only adds an

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 4:21 PM, Peter Geoghegan wrote: > I attach a much simpler patch, that only adds an opportunistic > "memcmp() == 0" before a possible strcoll(). Both > bttextfastcmp_locale() and varstr_cmp() have the optimization added, > since there is no point in leaving anyone out for t

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 11:25 AM, Peter Geoghegan wrote: > OK, I'll draft a patch for that today, including similar alterations > to varstr_cmp() for the benefit of Windows and so on. I attach a much simpler patch, that only adds an opportunistic "memcmp() == 0" before a possible strcoll(). Both
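
The opportunistic "memcmp() == 0" check described in this message might look something like the sketch below. It is not the code from the attached patch: compare_text() is a hypothetical stand-in for bttextfastcmp_locale() or varstr_cmp(), and it assumes both inputs are NUL-terminated with their lengths already known. The point is only that bytewise-identical strings are equal under any collation, so the more expensive strcoll() call can be skipped for them.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical comparator, assuming NUL-terminated inputs whose lengths
 * (excluding the terminator) are passed in alen and blen.
 */
static int
compare_text(const char *a, size_t alen, const char *b, size_t blen)
{
    int         result;

    assert(a != NULL && b != NULL);

    /* Fast path: identical bytes compare equal in every locale. */
    if (alen == blen && memcmp(a, b, alen) == 0)
        return 0;

    /* Slow path: locale-aware comparison. */
    result = strcoll(a, b);

    /*
     * Some locales report distinct strings as equal; break such ties
     * bytewise so that only identical strings compare as equal.
     */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

When the strings differ, the extra memcmp() is wasted work, which is the overhead the measurements elsewhere in the thread try to quantify.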

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 11:20 AM, Robert Haas wrote: > ...looks like about a 10-line patch. We have the data to show that > the loss is trivial even in the worst case, and we have or should be > able to get data showing that the best-case win is significant even > without the abbreviated key stuf

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Robert Haas
On Mon, Sep 15, 2014 at 1:55 PM, Peter Geoghegan wrote: > On Mon, Sep 15, 2014 at 10:53 AM, Robert Haas wrote: >> I think there's probably more than that to work out, but in any case >> there's no harm in getting a simple optimization done first before >> moving on to a complicated one. > > I gue

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 10:53 AM, Robert Haas wrote: > I think there's probably more than that to work out, but in any case > there's no harm in getting a simple optimization done first before > moving on to a complicated one. I guess we never talked about the abort logic in all that much detail.

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Robert Haas
On Mon, Sep 15, 2014 at 1:34 PM, Peter Geoghegan wrote: > On Mon, Sep 15, 2014 at 10:17 AM, Robert Haas wrote: >> It strikes me that perhaps we should make this change (rearranging >> things so that the memcmp tiebreak is run before strcoll) first, >> before dealing with the rest of the abbreviat

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2014 at 10:17 AM, Robert Haas wrote: > It strikes me that perhaps we should make this change (rearranging > things so that the memcmp tiebreak is run before strcoll) first, > before dealing with the rest of the abbreviated keys infrastructure. > It appears to be a separate improvem

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Robert Haas
On Sun, Sep 14, 2014 at 10:37 AM, Heikki Linnakangas wrote: > On 09/13/2014 11:28 PM, Peter Geoghegan wrote: >> Anyway, attached rough test program implements what you outline. This >> is for 30,000 32 byte strings (where just the final two bytes differ). >> On my laptop, output looks like this (e

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-15 Thread Heikki Linnakangas
On 09/14/2014 11:34 PM, Peter Geoghegan wrote: On Sun, Sep 14, 2014 at 7:37 AM, Heikki Linnakangas wrote: Both values vary in range 5.9 - 6.1 s, so it's fair to say that the useless memcmp() is free with these parameters. Is this the worst case scenario? Other than pushing the differences mu

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-14 Thread Peter Geoghegan
On Sun, Sep 14, 2014 at 7:37 AM, Heikki Linnakangas wrote: > Got to be careful to not let the compiler optimize away microbenchmarks like > this. At least with my version of gcc, the strcoll calls get optimized away, > as do the memcmp calls, if you don't use the result for anything. Clang was > e
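
One common way to guard against the problem Heikki describes is to make each comparison result feed into something the compiler cannot discard. The sketch below is an assumption about how such a loop could look, not the test program attached to the thread; `count_memcmp_matches` and `sink` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/*
 * Writes to a volatile object are observable side effects, so the
 * compiler has to compute whatever value is stored here.
 */
static volatile long sink;

/*
 * Hypothetical benchmark kernel: compare each adjacent pair of
 * fixed-length strings with memcmp() and count how many are equal.
 * Because every memcmp() result contributes to the returned count
 * (which is also stored in the volatile sink), the calls cannot be
 * removed as dead code.
 */
long
count_memcmp_matches(const char **strs, size_t n, size_t len)
{
    long equal = 0;

    for (size_t i = 1; i < n; i++)
    {
        if (memcmp(strs[i - 1], strs[i], len) == 0)
            equal++;
    }

    sink = equal;
    return equal;
}
```

A timing harness would wrap calls to such a function between clock readings; the same pattern applies to a strcoll() loop.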

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-14 Thread Heikki Linnakangas
On 09/13/2014 11:28 PM, Peter Geoghegan wrote: Anyway, attached rough test program implements what you outline. This is for 30,000 32 byte strings (where just the final two bytes differ). On my laptop, output looks like this (edited to only show median duration in each case): Got to be careful

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-13 Thread Peter Geoghegan
On Fri, Sep 12, 2014 at 11:38 AM, Robert Haas wrote: > Based on discussion thus far it seems that there's a possibility that > the trade-off may be different for short strings vs. long strings. If > the string is small enough to fit in the L1 CPU cache, then it may be > that memcmp() followed by

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-12 Thread Peter Geoghegan
On Fri, Sep 12, 2014 at 12:02 PM, Robert Haas wrote: > I think I've said a few times now that I really want to get this > additional data before forming an opinion. As a certain Mr. Doyle > writes, "It is a capital mistake to theorize before one has data. > Insensibly one begins to twist facts to

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-12 Thread Robert Haas
On Fri, Sep 12, 2014 at 2:58 PM, Peter Geoghegan wrote: > On Fri, Sep 12, 2014 at 11:38 AM, Robert Haas wrote: >> Based on discussion thus far it seems that there's a possibility that >> the trade-off may be different for short strings vs. long strings. If >> the string is small enough to fit in

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-12 Thread Peter Geoghegan
On Fri, Sep 12, 2014 at 11:38 AM, Robert Haas wrote: > Based on discussion thus far it seems that there's a possibility that > the trade-off may be different for short strings vs. long strings. If > the string is small enough to fit in the L1 CPU cache, then it may be > that memcmp() followed by

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-12 Thread Robert Haas
On Fri, Sep 12, 2014 at 5:28 AM, Heikki Linnakangas wrote: > On 09/12/2014 12:46 AM, Peter Geoghegan wrote: >> >> On Thu, Sep 11, 2014 at 1:50 PM, Robert Haas >> wrote: >>> >>> I think I said pretty clearly that it was. >> >> >> I agree that you did, but it wasn't clear exactly what factors you >

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-12 Thread Heikki Linnakangas
On 09/12/2014 12:46 AM, Peter Geoghegan wrote: On Thu, Sep 11, 2014 at 1:50 PM, Robert Haas wrote: I think I said pretty clearly that it was. I agree that you did, but it wasn't clear exactly what factors you were asking me to simulate. All factors. Do you want me to compare the same stri

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-11 Thread Peter Geoghegan
On Tue, Sep 9, 2014 at 2:25 PM, Robert Haas wrote: >> I like that I don't have to care about every combination, and can >> treat abbreviation abortion as the special case with the extra step, >> in line with how I think of the optimization conceptually. Does that >> make sense? > > No. comparetup

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-11 Thread Peter Geoghegan
On Thu, Sep 11, 2014 at 1:50 PM, Robert Haas wrote: > I think I said pretty clearly that it was. I agree that you did, but it wasn't clear exactly what factors you were asking me to simulate. It still isn't. Do you want me to compare the same string a million times in a loop, both with a strcoll(

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-11 Thread Robert Haas
On Thu, Sep 11, 2014 at 4:13 PM, Peter Geoghegan wrote: > On Wed, Sep 10, 2014 at 11:36 AM, Robert Haas wrote: >> No, not really. All you have to do is write a little test program to >> gather the information. > > I don't think a little test program is useful - IMV it's too much of a > simplific

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-11 Thread Peter Geoghegan
On Wed, Sep 10, 2014 at 11:36 AM, Robert Haas wrote: > No, not really. All you have to do is write a little test program to > gather the information. I don't think a little test program is useful - IMV it's too much of a simplification to suppose that a strcoll() has a fixed cost, and a memcmp()

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-10 Thread Robert Haas
On Wed, Sep 10, 2014 at 1:36 PM, Peter Geoghegan wrote: >> In order to know how much we're >> giving up in that case, we need the exact number I asked you to >> provide in my previous email: the ratio of the cost of strcoll() to >> the cost of memcmp(). >> >> I see that you haven't chosen to provi

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-10 Thread Peter Geoghegan
On Tue, Sep 9, 2014 at 2:00 PM, Robert Haas wrote: > Boiled down, what you're saying is that you might have a set that > contains lots of duplicates in general, but not very many where the > abbreviated-keys also match. Sure, that's true. Abbreviated keys are not used in the case where we do a (

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-09 Thread Robert Haas
On Fri, Sep 5, 2014 at 10:45 PM, Peter Geoghegan wrote: > While I gave serious consideration to your idea of having a dedicated > abbreviation comparator, and not duplicating sortsupport state when > abbreviated keys are used (going so far as to almost fully implement > the idea), I ultimately dec

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-09 Thread Robert Haas
On Thu, Sep 4, 2014 at 5:46 PM, Peter Geoghegan wrote: > On Thu, Sep 4, 2014 at 2:18 PM, Robert Haas wrote: >> Eh, maybe? I'm not sure why the case where we're using abbreviated >> keys should be different than the case we're not. In either case this >> is a straightforward trade-off: if we do

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-06 Thread Peter Geoghegan
On Sat, Sep 6, 2014 at 3:01 PM, Peter Geoghegan wrote: > I attach another amendment/delta patch Attached is another amendment to the patch set. With the recent addition of abbreviation support on 32-bit platforms, we should just hash the Datum representation as a uint32 on SIZEOF_DATUM != 8 platf

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-06 Thread Peter Geoghegan
On Fri, Sep 5, 2014 at 7:45 PM, Peter Geoghegan wrote: > Attached additional patches are intended to be applied on top of most > of the patches posted on September 2nd [1]. I attach another amendment/delta patch, intended to be applied on top of what was posted yesterday. I neglected to remove

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-05 Thread Peter Geoghegan
On Wed, Sep 3, 2014 at 2:44 PM, Peter Geoghegan wrote: > I guess it should still be a configure option, then. Or maybe there > should just be a USE_ABBREV_KEYS macro within pg_config_manual.h. Attached additional patches are intended to be applied on top of most of the patches posted on Septembe

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-04 Thread Peter Geoghegan
On Thu, Sep 4, 2014 at 5:07 PM, Peter Geoghegan wrote: > So I came up with what I imagined to be an unsympathetic case: BTW, this "cities" data is still available from: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/data/cities.dump
