Re: [HACKERS] sortsupport for text

2012-10-09 Thread Dimitri Fontaine
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 8 October 2012 21:35, Robert Haas robertmh...@gmail.com wrote:

Gentlemen, you can't fight here. This is the War Room.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sortsupport for text

2012-10-08 Thread Peter Geoghegan
Do you think you can follow through on this soon, Robert? I don't
believe that there are any outstanding issues. I'm not going to make
an issue of the fact that strxfrm() hasn't been taken advantage of. If
you could please post a new revision, with the suggested alterations
(that you agree with), I'd be happy to take another look.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] sortsupport for text

2012-10-08 Thread Robert Haas
On Mon, Oct 8, 2012 at 8:47 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Do you think you can follow through on this soon, Robert? I don't
 believe that there are any outstanding issues. I'm not going to make
 an issue of the fact that strxfrm() hasn't been taken advantage of. If
 you could please post a new revision, with the suggested alterations
 (that you agree with), I'd be happy to take another look.

I don't have any plans to work on this further.  I think all of the
criticism that has been leveled at this patch is 100% bogus, and I
greatly dislike the changes that have been proposed.  That may not be
fair, but it's how I feel, and in light of that feeling it does not
make sense for me to pursue this.  Sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] sortsupport for text

2012-10-08 Thread Peter Geoghegan
On 8 October 2012 16:30, Robert Haas robertmh...@gmail.com wrote:
 I don't have any plans to work on this further.  I think all of the
 criticism that has been leveled at this patch is 100% bogus, and I
 greatly dislike the changes that have been proposed.  That may not be
 fair, but it's how I feel, and in light of that feeling it does not
 make sense for me to pursue this.  Sorry.

If it was the case that you were only 50% of the way to getting
something committable, I guess I'd understand; this is, after all, a
volunteer effort, and you are of course free to pursue or not pursue
whatever you like. It's more like 95% though, so I have a hard time
understanding this attitude.

The only criticism that I imagine that you could be referring to is my
criticism (which was seconded by Tom) concerning the approach to
sizing a buffer that is used as state between successive calls to the
sortsupport routine. This was a fairly trivial point even within the
context of this patch. It was regrettable that it got so out of hand,
but that certainly wasn't my intention.

I must say that I find this disheartening as a reviewer, particularly
since it comes from someone who reviews a lot of patches (including
most of my own), and has often complained about the difficulties that
reviewers face. Is it really so hard for you to accept a criticism
from me? I didn't expect you to fully accept my criticism; I only
expected you to take it into consideration, and possibly let it go for
the sake of progress.

For what it's worth, I have high regard for your work generally. In
all sincerity, this appears to me to be a slight, and I find it
upsetting, both on a personal level, and because it creates a concern
about being able to work with you in the future, which is certainly
something that I've benefited from in the past, and hope to continue
to benefit from.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] sortsupport for text

2012-10-08 Thread Robert Haas
On Mon, Oct 8, 2012 at 12:26 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 If it was the case that you were only 50% of the way to getting
 something committable, I guess I'd understand; this is, after all, a
 volunteer effort, and you are of course free to pursue or not pursue
 whatever you like. It's more like 95% though, so I have a hard time
 understanding this attitude.

 The only criticism that I imagine that you could be referring to is my
 criticism (which was seconded by Tom) concerning the approach to
 sizing a buffer that is used as state between successive calls to the
 sortsupport routine. This was a fairly trivial point even within the
 context of this patch. It was regrettable that it got so out of hand,
 but that certainly wasn't my intention.

 I must say that I find this disheartening as a reviewer, particularly
 since it comes from someone who reviews a lot of patches (including
 most of my own), and has often complained about the difficulties that
 reviewers face. Is it really so hard for you to accept a criticism
 from me? I didn't expect you to fully accept my criticism; I only
 expected you to take it into consideration, and possibly let it go for
 the sake of progress.

 For what it's worth, I have high regard for your work generally. In
 all sincerity, this appears to me to be a slight, and I find it
 upsetting, both on a personal level, and because it creates a concern
 about being able to work with you in the future, which is certainly
 something that I've benefited from in the past, and hope to continue
 to benefit from.

Hey, if me deciding I don't want to work on a patch any more is going
to make you feel slighted, then you're out of luck.  The archives are
littered with people who have decided to stop working on things
because the consensus position on list was different than the approach
that they personally favored, and I have as much right to do that as
anyone else.  I think that you are perfectly entitled to feel
disappointed that I don't like your suggestions, even as I am
disappointed that the patch encountered so much resistance.  But let's
not turn it into a personal issue.  I don't think that you're a bad
person because you don't like my approach, and I hope you don't think
I'm a bad person because I don't like yours.  If you do, then that's
just something we're both going to have to live with, because I am not
going to make a policy of working on things because other people will
be offended if I don't.

As to what the substantive issue is, well, yeah, I don't like the
memory allocation strategy that you and Tom are proposing.  In the
worst case it chews up a lot of extra memory, and in the best case
there's no measurable performance benefit - or at least nobody has done
any work to demonstrate that there is one.  I've already pointed out
that, in fact, the strategy of doing power-of-two allocations is
rarely followed for very large allocations, a point to which I believe
no one has responded substantively.  Now I don't think anyone else is
required to respond to that point, but neither am I required to revise
the patch into a form that I don't agree with.  Equally, I will not
presume to commit it in a form that you and Tom do not agree with.
That would be asinine, and I try (not always successfully) not to be
an ass.  It is not as if anyone has phrased this as a maybe-we-should
sort of argument; there have been quite definite arguments on both
sides over apparently strongly-held positions.  I had hoped that this
was going to be a quick and non-controversial win, but 74 email
messages later it has become clear that it will be neither of those
things.

I do not have a problem with accepting review feedback from you or
from anyone else.  There are many examples in the archives of cases
where I have improved my code based on review feedback; indeed, the
possibility of getting high-quality feedback from the very smart
developers who inhabit this mailing list is for me a principal
attraction of working on PostgreSQL, not to mention a major reason why
PostgreSQL is as good a product as it is.  However, that general truth
does not oblige me to accept any particular piece of feedback as
well-founded, and this is hardly the first suggestion that I have
argued against in four years as a PostgreSQL developer, nor the first
of my patches to land on the cutting-room floor.  Nor do all of my
suggestions for other people's patches get adopted either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] sortsupport for text

2012-10-08 Thread Peter Geoghegan
On 8 October 2012 21:35, Robert Haas robertmh...@gmail.com wrote:
 Hey, if me deciding I don't want to work on a patch any more is going
 to make you feel slighted, then you're out of luck.  The archives are
 littered with people who have decided to stop working on things
 because the consensus position on list was different than the approach
 that they personally favored, and I have as much right to do that as
 anyone else.

Sure you do. However, I doubt very many of those who gave up did so
over so trivial an issue as to how to grow a buffer somewhere, and
those that did usually did not go on to become major contributors, and
certainly not committers. The buffer thing is, as far as I know, the
single solitary point of contention with this patch. We're talking
about 2 lines of code. To give up now is just petulant. There is no
other way of looking at it.

 It is not as if anyone has phrased this as a maybe-we-should
 sort of argument; there have been quite definite arguments on both
 sides over apparently strongly-held positions.  I had hoped that this
 was going to be a quick and non-controversial win, but 74 email
 messages later it has become clear that it will be neither of those
 things.

Many of those 74 emails concerned my completely unrelated digression
into exploiting strxfrm(); we spent a ridiculously long time
discussing how to size this buffer, but it still wasn't anything like
74 messages.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] sortsupport for text

2012-07-23 Thread Robert Haas
On Fri, Jun 15, 2012 at 6:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I would be concerned about this if it were per-sort-tuple wastage, but
 what I understood to be happening was that this was a single instance of
 an expansible buffer (per sort, perhaps, but still just one buffer).
 And, as you keep pointing out when it suits your argument, it's
 relatively uncommon to be sorting enormous values anyway.  So no, I am
 not particularly worried about that.  If you are, there are more
 important places to be concerned about allocation pad wastage, starting
 with palloc itself.

And, of course, palloc doesn't use power-of-two allocation up to
infinity either, as proposed here; beyond a certain limit it switches
to calling malloc(), which doesn't use power-of-two allocations for
large requests either, at least not in glibc - for anything 128kB or
above, it uses mmap to grab something of exactly the requested size
(rounded up to the OS page size) and returns it to the OS unconditionally
when it's freed.

However, what this really boils down to is that you and Peter don't
like this line of code:

+   tss->buflen1 = TYPEALIGN(TEXTBUFLEN, len1);

What would you like it to say instead?

The obvious formulation is:

while (len1 > tss->buflen1)
    tss->buflen1 *= 2;

Or perhaps the following, which will normally be more efficient,
though possibly not as efficient as what I've got there now:

tss->buflen1 = 1 << ffs(len1);

Of course if len1 happens to be large enough that the high-order bit
is set, the first proposal above will fail to terminate and the second
one will produce 0.  That can't really happen in practice because a
toasted datum is limited to 1GB, of course, and I suppose a lot of
code would need to be adjusted if we ever wanted to raise that limit,
so maybe it's not worth worrying about it.

So, is one of the above code fragments what people want?  Or something else?

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-07-23 Thread Peter Geoghegan
On 23 July 2012 16:09, Robert Haas robertmh...@gmail.com wrote:
 However, what this really boils down to is that you and Peter don't
 like this line of code:

 +   tss->buflen1 = TYPEALIGN(TEXTBUFLEN, len1);

I can only speak for myself, though I agree with your summary here.

 What would you like it to say instead?

 The obvious formulation is:

 while (len1 > tss->buflen1)
     tss->buflen1 *= 2;

That's what I had in mind. +1.

 Or perhaps the following, which will normally be more efficient,
 though possibly not as efficient as what I've got there now:

 tss->buflen1 = 1 << ffs(len1);

I'm sorry, I don't follow you here. What is ffs() ?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-07-23 Thread Robert Haas
On Mon, Jul 23, 2012 at 11:34 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 23 July 2012 16:09, Robert Haas robertmh...@gmail.com wrote:
 However, what this really boils down to is that you and Peter don't
 like this line of code:

 +   tss->buflen1 = TYPEALIGN(TEXTBUFLEN, len1);

 I can only speak for myself, though I agree with your summary here.

 What would you like it to say instead?

 The obvious formulation is:

 while (len1 > tss->buflen1)
     tss->buflen1 *= 2;

 That's what I had in mind. +1.

 Or perhaps the following, which will normally be more efficient,
 though possibly not as efficient as what I've got there now:

 tss->buflen1 = 1 << ffs(len1);

 I'm sorry, I don't follow you here. What is ffs() ?

Sorry, fls, not ffs.  I always get those mixed up.

See src/port/fls.c

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-07-23 Thread Peter Geoghegan
On 23 July 2012 16:36, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 23, 2012 at 11:34 AM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
 tss->buflen1 = 1 << ffs(len1);

 I'm sorry, I don't follow you here. What is ffs() ?

 Sorry, fls, not ffs.  I always get those mixed up.

 See src/port/fls.c

Oh, okay. Since, I infer, we're starting from a buffer-size that's a
power-of-two anyway, is there really any advantage in doing this
rather than just doubling the buffer size each time?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-07-23 Thread Robert Haas
On Mon, Jul 23, 2012 at 12:07 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 23 July 2012 16:36, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 23, 2012 at 11:34 AM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
 tss->buflen1 = 1 << ffs(len1);

 I'm sorry, I don't follow you here. What is ffs() ?

 Sorry, fls, not ffs.  I always get those mixed up.

 See src/port/fls.c

 Oh, okay. Since, I infer, we're starting from a buffer-size that's a
 power-of-two anyway, is there really any advantage in doing this
 rather than just doubling the buffer size each time?

Well, if you're using a builtin fls rather than our src/port
implementation, it's probably a single machine language instruction
instead of a loop.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-21 Thread Florian Pflug
On Jun20, 2012, at 19:38 , Peter Geoghegan wrote:
 On 20 June 2012 17:41, Tom Lane t...@sss.pgh.pa.us wrote:
 In any case, if you have to redefine the meaning of equality in order
 to justify a performance patch, I'm prepared to walk away at the start.
 
 The advantage of my proposed implementation is precisely that I won't
 have to redefine the meaning of equality, and that only the text
 datatype will have to care about equivalency, so you can just skip
 over an explanation of equivalency for most audiences.

Ok, so what exactly is it that you're proposing then? AFAICS, you're
proposing to make the less-than operator rely solely on strcoll(). If you
don't also redefine the equality operator to match, you'll end up with
cases where !(a < b) and !(b < a), yet also !(a == b). This breaks the
trichotomy law, no? Which in turn breaks merge-joins - I believe we'd
output the cartesian product {potty, potyty} x {potyty, potty} when
merge-joining {potty, potyty} with itself, unless the comparison operator
contains a tie-breaker. Which is obviously wrong unless you decree
that 'potty' = 'potyty'.

I do agree that there's a use-case for having a textual type which
treats equivalent strings as equal (and which should then also treat
equivalent Unicode representations of the same character as equal). But
it should be a separate type, not bolted onto the existing textual types.

best regards,
Florian Pflug




Re: [HACKERS] sortsupport for text

2012-06-21 Thread Florian Pflug
On Jun21, 2012, at 02:22 , Peter Geoghegan wrote:
 I've written a very small C++ program, which I've attached, that
 basically proves that this can still make a fairly large difference -
 I hope it's okay that it's C++, but that allowed me to write the
 program quickly, with no dependencies for anyone who cares to try this
 out, other than a C++ compiler and the standard library.

Uh, are you sure the results aren't swapped? string_wrapper takes a
parameter called use_strcoll, and it indeed uses strcoll() if that
parameter is true, and strxfrm() otherwise. But then it passes *false*
to determine the speed of strcoll(), and *true* to determine the speed
of strxfrm()...

Also, place_random_string() needs to insert a terminating zero byte,
otherwise set_test segfaults, at least on my OSX machine.

best regards,
Florian Pflug




Re: [HACKERS] sortsupport for text

2012-06-21 Thread Peter Geoghegan
On 21 June 2012 10:24, Florian Pflug f...@phlo.org wrote:
 On Jun21, 2012, at 02:22 , Peter Geoghegan wrote:
 I've written a very small C++ program, which I've attached, that
 basically proves that this can still make a fairly large difference -
 I hope it's okay that it's C++, but that allowed me to write the
 program quickly, with no dependencies for anyone who cares to try this
 out, other than a C++ compiler and the standard library.

 Uh, are you sure the results aren't swapped? string_wrapper takes a
 parameter called use_strcoll, and it indeed uses strcoll() if that
 parameter is true, and strxfrm() otherwise. But then it passes *false*
 to determine the speed of strcoll(), and *true* to determine the speed
 of strxfrm()...

Egads, you're right. Apparently in my haste I gave the wrong boolean
argument to the class constructor in each case. I am completely wrong
about this. I apologise for the careless mistake. At least I advanced
the debate, though - we don't want to do any ad-hoc generation of
strxfrm() output within comparators, even when one side can have a
cached value.

You're right about merge-joins; I cannot answer that without
arbitrarily making a convenient exception about the conditions that
they join on, which is itself unacceptable. I cannot change the
semantics of equality to do something with strxfrm either, since that
would have unacceptable performance implications, as is evident from
my test-case. I wish someone had a better idea, but I suppose at this
point the idea of concatenating strxfrm() output with the original
string begins to look like the least-worst option.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-21 Thread Florian Pflug
On Jun21, 2012, at 11:55 , Peter Geoghegan wrote:
 On 21 June 2012 10:24, Florian Pflug f...@phlo.org wrote:
 On Jun21, 2012, at 02:22 , Peter Geoghegan wrote:
 I've written a very small C++ program, which I've attached, that
 basically proves that this can still make a fairly large difference -
 I hope it's okay that it's C++, but that allowed me to write the
 program quickly, with no dependencies for anyone who cares to try this
 out, other than a C++ compiler and the standard library.
 
 Uh, are you sure the results aren't swapped? string_wrapper takes a
 parameter called use_strcoll, and it indeed uses strcoll() if that
 parameter is true, and strxfrm() otherwise. But then it passes *false*
 to determine the speed of strcoll(), and *true* to determine the speed
 of strxfrm()...
 
 Egads, you're right. Apparently in my haste I gave the wrong boolean
 argument to the class constructor in each case. I am completely wrong
 about this. I apologise for the careless mistake. At least I advanced
 the debate, though - we don't want to do any ad-hoc generation of
 strxfrm() output within comparators, even when one side can have a
 cached value.

FWIW, I've played around a bit with a fixed version of your set_test.cpp.
I made it use the cached sort key of the RHS also, made it preserve sort
keys during copy construction, even used a boost::shared_ptr to avoid
actually copying the cached sort key. The strxfrm() version still performed
about 30% worse than the strcoll() version. 

From that, I figured that tree inserts don't trigger enough comparisons to
make strxfrm() worthwhile. So I mangled set_test.cpp a bit more and instead
of a std::set, I created a (C-style) array of string_wrapper instances,
and sorted them with std::sort(). Which made strcoll() twice as fast as
strxfrm()...

At this point, my theory is that your choice of random strings prevents
strxfrm() from ever winning over strcoll(). The reason being that you pick
each letter uniformly distributed from a-z, resulting in a probability of
two string differing in the first of 1 - 1/26 =~ 96%. Thus, even for
extremely complex collation rules, strcoll() probably only needs to compare
a few characters to determine the order. Whereas strxfrm() has to compute
the whole sort key, no matter what.

The question is thus, how good a model are your random strings for the
input of a typical sorting step in postgres? My guess is, a quite good one
actually, since people probably don't deal with a lot of very similar strings
very often. Which makes me wonder if using strxfrm() during sorting wouldn't
be a net loss, all things considered…

My modified set_test.cpp (which should now be called array_test.cpp)
is attached.

best regards,
Florian Pflug





Re: [HACKERS] sortsupport for text

2012-06-21 Thread Peter Geoghegan
On 21 June 2012 11:40, Florian Pflug f...@phlo.org wrote:
 At this point, my theory is that your choice of random strings prevents
 strxfrm() from ever winning over strcoll(). The reason being that you pick
 each letter uniformly distributed from a-z, resulting in a probability of
 two string differing in the first of 1 - 1/26 =~ 96%. Thus, even for
 extremely complex collation rules, strcoll() probably only needs to compare
 a few characters to determine the order. Whereas strxfrm() has to compute
 the whole sort key, no matter what.

Good point.

 The question is thus, how good a model are your random strings for the
 input of a typical sorting step in postgres? My guess is, a quite good one
 actually, since people probably don't deal with a lot of very similar strings
 very often. Which makes me wonder if using strxfrm() during sorting wouldn't
 be a net loss, all things considered…

Well, I think the answer to that has to be no. I posted a simple
postgres text -> strxfrm blob (bytea) SQL-callable wrapper function a
few days ago that clearly wins by quite a bit on some sample queries
against the dellstore sample database, which has a representative
sample set. Completely uniformly-distributed data actually isn't
representative of the real world at all. Normal distributions abound.

This C++ program was never intended to justify the general utility of
using strxfrm() for sorting, which I don't believe is in question. I
just wanted to get some idea about the performance characteristics of
using strxfrm() to traverse an index.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Eisentraut
On sön, 2012-06-17 at 23:58 +0100, Peter Geoghegan wrote:
 So if you take the word "Aßlar" here - that is equivalent to "Asslar",
 and so strcoll("Aßlar", "Asslar") will return 0 if you have the right
 LC_COLLATE

This is not actually correct.  glibc will sort "Asslar" before "Aßlar", and
that is correct in my mind.

When a Wikipedia page on some particular language's alphabet says
something like $letterA and $letterB are equivalent, what it really
means is that they are sorted the same compared to other letters, but
are distinct when ties are broken.

  (if you tried this out for yourself and found that I was
 actually lying through my teeth, pretend I said Hungarian instead of
 German and some really obscure character rather than ß).

Yeah, there are obviously exceptions, which led to the original change
being made, but they are not as widespread as they appear to be.

The real issue in this area, I suspect, will be dealing with Unicode
combining sequences versus equivalent precombined characters.  But
support for that is generally crappy, so it's not urgent to deal with
it.




Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 20 June 2012 11:00, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-06-17 at 23:58 +0100, Peter Geoghegan wrote:
 So if you take the word "Aßlar" here - that is equivalent to "Asslar",
 and so strcoll("Aßlar", "Asslar") will return 0 if you have the right
 LC_COLLATE

 This is not actually correct.  glibc will sort "Asslar" before "Aßlar", and
 that is correct in my mind.

Uh, what happened here was that I assumed that it was correct, then
went to verify it before sending the mail and found that it wasn't.
Since I couldn't immediately find any hard data about which characters
this did apply to, I decided to turn it into a joke. I say this, and
yet you've included that bit of the e-mail inline in your reply, so
maybe it just wasn't a very good joke.

 When a Wikipedia page on some particular language's alphabet says
 something like $letterA and $letterB are equivalent, what it really
 means is that they are sorted the same compared to other letters, but
 are distinct when ties are broken.

I know.

  (if you tried this out for yourself and found that I was
 actually lying through my teeth, pretend I said Hungarian instead of
 German and some really obscure character rather than ß).

 Yeah, there are obviously exceptions, which led to the original change
 being made, but they are not as widespread as they appear to be.

True.

 The real issue in this area, I suspect, will be dealing with Unicode
 combining sequences versus equivalent precombined characters.  But
 support for that is generally crappy, so it's not urgent to deal with
 it.

I agree that it isn't urgent. However, I have an ulterior motive,
which is that in allowing for this, we remove the need to strcmp()
after each strcoll(), and consequently it becomes possible to use
strxfrm() instead. Now, we could also use a hack that's going to make
the strxfrm() blobs even bulkier still (basically, concatenate the
original text to the blob before strcmp()), but I don't want to go
there if it can possibly be avoided.

I should also point out that we pride ourselves on following the
letter of the standard when that makes sense, and we are currently not
doing that in respect of the Unicode standard.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 19 June 2012 19:44, Peter Geoghegan pe...@2ndquadrant.com wrote:
 PostgreSQL supported Unicode before 2005, when the tie-breaker was
 introduced. I know at least one Swede who used Postgres95. I just took
 a look at the REL6_4 branch, and it looks much the same in 1999 as it
 did in 2005, in that there is no tie-breaker after the strcoll(). Now,
 that being the case, and Hungarian in particular having a whole bunch
 of these equivalencies, I have to wonder if the original complainant's
 problem really was diagnosed correctly. It could of had something to
 do with the fact that texteq() was confused about whether it reported
 equality or equivalency - it may have taken that long for the (len1 !=
 len2) fastpath thing (only holds for equality, not equivalence,
 despite the fact that the 2005-era strcoll() call checks equivalence
 within texteq() ) to trip someone out, because texteq() would have
 thereby given inconsistent answers in a very subtle way, that were not
 correct either according to the Hungarian locale, nor according to
 simple bitwise equality.

It seems likely that this is more-or-less correct. The two equivalent
strings had a variable number of characters, so the fastpath made
texteq not accord with varstr_cmp(), even though texteq() itself also
only had a single strcoll() call.

So there were some tuples with the Hungarian string 'potty' in the 2005
bug report. They were not visible to any of the queries seen in the test
cases, even the good ones. There were a few tuples with equivalent
strings like 'potyty' that were visible.

The index scans didn't fail to return the expected tuples because the
indexes were internally inconsistent or otherwise corrupt. Rather, the
_bt_checkkeys() function returned early because the faulty
half-equivalence, half-equality texteq() comparator reported that the
tuple returned didn't satisfy the qual, and on that basis the index
scan stopped. This usually wouldn't happen with Swedish, because their
equivalencies tend to be one character long, and are less common.

So far, so good, but how did this not blow-up sooner? Did Hungarians
only start using Postgres in late 2005, immediately after the 8.1
release? Hardly.

Commit c1d62bfd00f4d1ea0647e12947ca1de9fea39b33, made in late 2003
(Add operator strategy and comparison-value datatype fields to
ScanKey), may be part of the problem here.

Consider this test within _bt_checkkeys(), that was changed by that commit:

-   if (key->sk_flags & SK_COMMUTE)
-       test = FunctionCall2(key->sk_func,
-                            key->sk_argument, datum);
-   else
-       test = FunctionCall2(key->sk_func,
-                            datum, key->sk_argument);
+   test = FunctionCall2(key->sk_func, datum, key->sk_argument);

-   if (DatumGetBool(test) == !!(key->sk_flags & SK_NEGATE))
+   if (!DatumGetBool(test))

I think that this change may have made the difference between the
Hungarians getting away with it and not getting away with it. Might it
have been that for text, they were using some operator that wasn't '='
(perhaps one which has no fastpath, and thus correctly made a
representation about equivalency) rather than texteq prior to this
commit? I didn't eyeball the pg_amop entries of the era myself, but it
seems quite possible. In any case, I find it hard to believe that it
took at least ten years for this problem to manifest itself just
because it took that long for a Hungarian with a strcoll()
implementation that correctly represented equivalency to use Postgres.
A year is a more plausible window.

If we do introduce an idea of equivalency to make all this work, that
means there'll have to be equivalency verification wherever equality
verification returns false, in a number of places, including the
above. For GiST, there is an equivalent test (which will still vary
based on the strategy number used) dubbed the consistent function,
which seems analogous to the above.

So, you're going to have an extra strcoll()/strxfrm() + strcmp() here,
as part of a not-equal-but-maybe-equivalent test, which is bad.
However, if that means that we can cache a text constant as a
strxfrm() blob, and compare in a strxfrm()-wise fashion, that will
more than pay for itself, even for btree traversal alone. For a naive
strxfrm() + strcoll() implementation, that will save just under half
of the work, and everyone knows that the cost of the comparison is
what dominates here, particularly for certain collations.

We'd probably formalise it to the point where there'd be a btree
strategy number and fully-fledged equivalency operator that the user
could conceivably use themselves.

There seems to be scope-creep here. I'm not sure that I should
continue with this as part of this review. Maybe this should be
something that I work on for the next commitfest.

It would be nice to hear what others thought of these ideas before I
actually start writing a patch that both fixes these problems (our
behaviour is incorrect for some locales according to the Unicode
standard), facilitates a strxfrm() optimisation, and actually adds a
strxfrm() optimisation.

Re: [HACKERS] sortsupport for text

2012-06-20 Thread Greg Stark
On Sun, Jun 17, 2012 at 9:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The trick for hashing such datatypes is to be able to guarantee that
 equal values hash to the same hash code, which is typically possible
 as long as you know the equality rules well enough.  We could possibly
 do that for text with pure-strcoll equality if we knew all the details
 of what strcoll would consider equal, but we do not.

It occurs to me that strxfrm would answer this question. If we made
the hash function hash the result of strxfrm then we could make
equality use strcoll and not fall back to strcmp.

I suspect that in a green field that's what we would do, though the
CPU cost might be enough to make us think hard about it. I'm not sure
it's worth considering switching now, though.

The case where it matters to users, incidentally, is when you have a
multi-column sort order and have values that are supposed to sort
equal in the first column but print differently. Given that there
seems to be some controversy in the locale definitions -- most locales
seem to use insignificant factors like accents or ligatures as
tie-breakers and avoid claiming different sequences are equal even
when the language usually treats them as equivalent -- it doesn't seem
super important to maintain the property for the few locales that fall
the other way. Unless my impression is wrong and there's a good
principled reason why some locales treat nearly equivalent strings one
way and some treat them the other.

-- 
greg



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 20 June 2012 15:10, Greg Stark st...@mit.edu wrote:
 On Sun, Jun 17, 2012 at 9:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The trick for hashing such datatypes is to be able to guarantee that
 equal values hash to the same hash code, which is typically possible
 as long as you know the equality rules well enough.  We could possibly
 do that for text with pure-strcoll equality if we knew all the details
 of what strcoll would consider equal, but we do not.

 It occurs to me that strxfrm would answer this question. If we made
 the hash function hash the result of strxfrm then we could make
 equality use strcoll and not fall back to strcmp.

What about per-column collations?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Greg Stark
On Wed, Jun 20, 2012 at 3:19 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 It occurs to me that strxfrm would answer this question. If we made
 the hash function hash the result of strxfrm then we could make
 equality use strcoll and not fall back to strcmp.

 What about per-column collations?

Well, collations aren't really per-column; they're specific to the
comparison in the expression at hand. The per-column collation is just
the default for comparisons against that column.

So for a hash join, for example, you would build the hash table
using strxfrm() for the collation being used by the join expression.
For hash indexes the index would only be valid for a specific
collation, just like a btree index is.

But this all seems like a lot of work for a case that most locales
seem to think isn't worth worrying about.

-- 
greg



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 I think that this change may have made the difference between the
 Hungarians getting away with it and not getting away with it. Might it
 have been that for text, they were using some operator that wasn't '='
 (perhaps one which has no fastpath, and thus correctly made a
 representation about equivalency) rather than texteq prior to this
 commit?

Uh, no.  There aren't any magic variants of equality now, and there were
not back then either.  I'm inclined to think that the Hungarian
problem did exist long before we fixed it.

 So, you're going to have an extra strcoll()/strxfrm() + strcmp() here,
 as part of a not-equal-but-maybe-equivalent test, which is bad.
 However, if that means that we can cache a text constant as a
 strxfrm() blob, and compare in a strxfrm()-wise fashion, that will
 more than pay for itself, even for btree traversal alone.

Um ... are you proposing to replace text btree index entries with
strxfrm values?  Because if you aren't, I don't see that this is likely
to win anything.  And if you are, it loses on the grounds of (a) index
bloat and (b) loss of ability to do index-only scans.

 It would be nice to hear what others thought of these ideas before I
 actually start writing a patch that both fixes these problems (our
 behaviour is incorrect for some locales according to the Unicode
 standard),  facilitates a strxfrm() optimisation, and actually adds a
 strxfrm() optimisation.

Personally I think this is not a direction we want to go in.  I think
that it's going to end up a significant performance loss in many cases,
break backwards compatibility in numerous ways, and provide a useful
behavioral improvement to only a small minority of users.  If you check
the archives, we get far more complaints from people who think their
locale definition is wrong than from people who are unhappy because
we don't hew exactly to the corner cases of their locale.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 20 June 2012 15:55, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 I think that this change may have made the difference between the
 Hungarians getting away with it and not getting away with it. Might it
 have been that for text, they were using some operator that wasn't '='
 (perhaps one which has no fastpath, and thus correctly made a
 representation about equivalency) rather than texteq prior to this
 commit?

 Uh, no.  There aren't any magic variants of equality now, and there were
 not back then either.  I'm inclined to think that the Hungarian
 problem did exist long before we fixed it.

I was suggesting that an operator other than equality may have been
used for some reason. It probably isn't a significant issue though.

 Um ... are you proposing to replace text btree index entries with
 strxfrm values?  Because if you aren't, I don't see that this is likely
 to win anything.  And if you are, it loses on the grounds of (a) index
 bloat and (b) loss of ability to do index-only scans.

No, I'm suggesting it would probably be at least a bit of a win here
to cache the constant, and only have to do a strxfrm() + strcmp() per
comparison. Not enough to justify all the added complexity of course,
but I wouldn't seek to justify this on the basis of improvements to
the speed of btree traversal.  It would obviously be much more
valuable for tuple sorting.

 Personally I think this is not a direction we want to go in.  I think
 that it's going to end up a significant performance loss in many cases,
 break backwards compatibility in numerous ways, and provide a useful
 behavioral improvement to only a small minority of users.

I may have over-emphasized the improvement in correctness that this
would bring, which I personally consider to be very much a secondary
benefit. The fact is that this is likely to be a fairly significant
performance win, because strxfrm() is quite simply the way you're
supposed to do collation-aware sorting, and is documented as such. For
that reason, C standard library implementations should not be expected
to emphasize its performance - they assume that you're using strxfrm()
+ their highly optimised strcmp() (as I've said, the glibc
implementation is written in ASM, and makes use of hand-written SSSE3
instructions on x86_64 for example).

The only performance downside that I can think of is the added check
for equivalence for each tuple within _bt_checkkeys() - perhaps you
were thinking of something else that hasn't occurred to me though.
Were you?

The added overhead is only going to be paid only once per index scan,
and not once per tuple returned by an index scan. Since equality
implies equivalence for us, there is no need to check equivalence
until something is unequal, and when that happens, and equivalence
doesn't hold, we're done. Meanwhile, that entire index scan just got
measurably faster from caching the constant's strxfrm() blob.

Now, maybe it is a bit funky that this equivalence test has to happen
even though the vast majority of locales don't care. If the only
alternative is to bloat up the strxfrm() blobs with the original
string, I'd judge that the funkiness is well worth it.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 No, I'm suggesting it would probably be at least a bit of a win here
 to cache the constant, and only have to do a strxfrm() + strcmp() per
 comparison.

Um, have you got any hard evidence to support that notion?  The
traditional advice is that strcoll is faster than using strxfrm unless
the same strings are to be compared repeatedly.  I'm not convinced that
saving the strxfrm on just one side will swing the balance.

 The fact is that this is likely to be a fairly significant
 performance win, because strxfrm() is quite simply the way you're
 supposed to do collation-aware sorting, and is documented as such. For
 that reason, C standard library implementations should not be expected
 to emphasize its performance - they assume that you're using strxfrm()
 + their highly optimised strcmp()

Have you got any evidence in support of this claim, or is it just
wishful thinking about what's likely to be inside libc?  I'd also note
that any comparisons you may have seen about this are certainly not
accounting for the effects of data bloat from strxfrm (ie, possible
spill to disk, more merge passes, etc).

In any case, if you have to redefine the meaning of equality in order
to justify a performance patch, I'm prepared to walk away at the start.
The range of likely performance costs/benefits across different locales
and different implementations is so wide that if you can't show it to be
a win even with the strcmp tiebreaker, it's not likely to be a reliable
win without that.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Robert Haas
On Wed, Jun 20, 2012 at 12:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The fact is that this is likely to be a fairly significant
 performance win, because strxfrm() is quite simply the way you're
 supposed to do collation-aware sorting, and is documented as such. For
 that reason, C standard library implementations should not be expected
 to emphasize its performance - they assume that you're using strxfrm()
 + their highly optimised strcmp()

 Have you got any evidence in support of this claim, or is it just
 wishful thinking about what's likely to be inside libc?  I'd also note
 that any comparisons you may have seen about this are certainly not
 accounting for the effects of data bloat from strxfrm (ie, possible
 spill to disk, more merge passes, etc).

 In any case, if you have to redefine the meaning of equality in order
 to justify a performance patch, I'm prepared to walk away at the start.
 The range of likely performance costs/benefits across different locales
 and different implementations is so wide that if you can't show it to be
 a win even with the strcmp tiebreaker, it's not likely to be a reliable
 win without that.

On the testing I've done, the strcmp() tie-breaking rarely gets run
anyway.  Unless you're sorting data with only a few distinct values,
most comparisons are between values that are distinct under any choice
of collation, and therefore strcoll() returns 0 very rarely, and
therefore the additional runtime it consumes does not matter very
much.  Also, it's quite a bit faster than strcoll() anyway, so even
when it does run it doesn't add much to the total time.

I think the elephant in the room here is that we're relying on the OS
to do everything for us, and the OS API we use (strcoll) requires an
extra memcpy and is also dreadfully slow.  If we could solve that
problem, it would save us a lot more than worrying about the extra
strcmp().  Of course, solving that problem is hard: we either have to
get the glibc and FreeBSD libc folks to provide a better API (that
takes lengths for each input instead of relying on trailing nul
bytes), or reimplement locales within PG, or store trailing NUL bytes
that we don't really need in the index so we can apply strcoll
directly, none of which are very appealing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 20 June 2012 17:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 No, I'm suggesting it would probably be at least a bit of a win here
 to cache the constant, and only have to do a strxfrm() + strcmp() per
 comparison.

 Um, have you got any hard evidence to support that notion?  The
 traditional advice is that strcoll is faster than using strxfrm unless
 the same strings are to be compared repeatedly.  I'm not convinced that
 saving the strxfrm on just one side will swing the balance.

Fair enough, but I'd suggest that the traditional advice assumes
that strcoll()'s space-efficiency and general avoidance of dynamic
allocation is an important factor, which, for us, it clearly isn't,
since we can re-use a single buffer in the manner of Robert's text
sortsupport patch for each and every per-tuple strxfrm() blob (when
traversing an index). This is a direct quote from glibc's
strcoll_l.c:

 Perform the first pass over the string and while doing this find
 and store the weights for each character.  Since we want this to
 be as fast as possible we are using `alloca' to store the temporary
 values.  But since there is no limit on the length of the string
 we have to use `malloc' if the string is too long.  We should be
 very conservative here.

Here, alloca is used to allocate space in a stack frame. I believe
that this is an entirely inappropriate trade-off for Postgres to be
making. strxfrm(), in constrast, leaves buffer sizing and management
up to the caller. That has to be a big part of the problem here.

 The fact is that this is likely to be a fairly significant
 performance win, because strxfrm() is quite simply the way you're
 supposed to do collation-aware sorting, and is documented as such. For
 that reason, C standard library implementations should not be expected
 to emphasize its performance - they assume that you're using strxfrm()
 + their highly optimised strcmp()

 Have you got any evidence in support of this claim, or is it just
 wishful thinking about what's likely to be inside libc?

According to the Single UNIX Specification's strcoll() documentation,
"The strxfrm() and strcmp() functions should be used for sorting large
lists." If that isn't convincing enough for you, there is the fact
that glibc's strcmp() is clearly highly optimised for each and every
architecture, and that we are currently throwing away an extra
strcmp() in the event of strcoll() equality.

 I'd also note that any comparisons you may have seen about this are
 certainly not accounting for the effects of data bloat from strxfrm
 (ie, possible spill to disk, more merge passes, etc).

What about the fact that strcoll() may be repeatedly allocating and
freeing memory per comparison? The blobs really aren't that much
larger than the strings to be sorted, which are typically quite short.

 In any case, if you have to redefine the meaning of equality in order
 to justify a performance patch, I'm prepared to walk away at the start.

The advantage of my proposed implementation is precisely that I won't
have to redefine the meaning of equality, and that only the text
datatype will have to care about equivalency, so you can just skip
over an explanation of equivalency for most audiences.

If you feel that strongly about it, and I have no possible hope of
getting this accepted, I'm glad that I know now rather than after
completing a significant amount of work on this. I would like to hear
other people's opinions before I drop it though.

 The range of likely performance costs/benefits across different locales
 and different implementations is so wide that if you can't show it to be
 a win even with the strcmp tiebreaker, it's not likely to be a reliable
 win without that.

glibc is the implementation that really matters. My test-case used
en_US.UTF-8 as its locale, which has to be one of the least stressful
to strcoll() - I probably could have shown a larger improvement just
by selecting a locale that was known to have to make more passes,
like, say, hu_HU.UTF-8.

I expect to have access to a 16 core server next week, which Bull have
made available to me. Maybe I'll get some interesting performance
numbers from it.

The reason that I don't want to use the blob with original string
hack is that it's ugly, space-inefficient, unnecessary and objectively
incorrect, since it forces us to violate conformance requirement
C9 of Unicode 3.0, marginal though that may be.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 20 June 2012 17:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 No, I'm suggesting it would probably be at least a bit of a win here
 to cache the constant, and only have to do a strxfrm() + strcmp() per
 comparison.

 Um, have you got any hard evidence to support that notion?  The
 traditional advice is that strcoll is faster than using strxfrm unless
 the same strings are to be compared repeatedly.  I'm not convinced that
 saving the strxfrm on just one side will swing the balance.

I've written a very small C++ program, which I've attached, that
basically proves that this can still make a fairly large difference -
I hope it's okay that it's C++, but that allowed me to write the
program quickly, with no dependencies for anyone who cares to try this
out, other than a C++ compiler and the standard library.

It just inserts 500,000 elements (the same succession of pseudo-random
strings again and again) into a std::set, which is implemented using a
red-black tree on my machine. In one case, we just use strcmp() (there
is actually no tie-breaker). In the other case, the comparator
allocates and fills memory for a strxfrm blob when needed, caches it,
and uses strxfrm() to generate blobs for other elements into a
dedicated buffer which persists across comparisons.

It prints the duration of each run, and a running average for each
case until terminated. Here is what the output looks like on my
machine:

[peter@peterlaptop strxfrm_test]$ g++ -O2 set_test.cpp
[peter@peterlaptop strxfrm_test]$ ./a.out
Time elapsed with strcoll: 1.485290 seconds
Time elapsed with strxfrm optimization: 1.070211 seconds
Time elapsed with strcoll: 1.813725 seconds
Time elapsed with strxfrm optimization: 1.097950 seconds
Time elapsed with strcoll: 1.813381 seconds
Time elapsed with strxfrm optimization: 1.102769 seconds
Time elapsed with strcoll: 1.826708 seconds
Time elapsed with strxfrm optimization: 1.105093 seconds
Time elapsed with strcoll: 1.842156 seconds
Time elapsed with strxfrm optimization: 1.103198 seconds
strcoll average: 1.756252 strxfrm average: 1.095844
Time elapsed with strcoll: 1.834785 seconds
Time elapsed with strxfrm optimization: 1.105369 seconds
Time elapsed with strcoll: 1.817449 seconds
Time elapsed with strxfrm optimization: 1.110386 seconds
Time elapsed with strcoll: 1.812446 seconds
Time elapsed with strxfrm optimization: 1.101266 seconds
Time elapsed with strcoll: 1.813595 seconds
Time elapsed with strxfrm optimization: 1.099444 seconds
Time elapsed with strcoll: 1.814584 seconds
Time elapsed with strxfrm optimization: 1.099542 seconds
strcoll average: 1.787412 strxfrm average: 1.099523
Time elapsed with strcoll: 1.820218 seconds
Time elapsed with strxfrm optimization: 1.102984 seconds
Time elapsed with strcoll: 1.817549 seconds
Time elapsed with strxfrm optimization: 1.100526 seconds
Time elapsed with strcoll: 1.817020 seconds
Time elapsed with strxfrm optimization: 1.099273 seconds
Time elapsed with strcoll: 1.820983 seconds
Time elapsed with strxfrm optimization: 1.100501 seconds
Time elapsed with strcoll: 1.813270 seconds
Time elapsed with strxfrm optimization: 1.098904 seconds
strcoll average: 1.797544 strxfrm average: 1.099828
Time elapsed with strcoll: 1.815952 seconds
Time elapsed with strxfrm optimization: 1.102583 seconds

If you'd like to print the contents of the set, there are a couple of
lines you can uncomment. It should be straightforward to understand
the program, even if you don't know any C++.

Note that I still think that the compelling case for strxfrm() caching
is sorting tuples, not btree index traversal. All that I ask is that
you allow that on the whole, this will pay for the cost of verifying
equivalency within _bt_checkkeys() once per index-scan.

I am aware that a red-black tree isn't generally an excellent proxy
for how a btree will perform, but I don't think that that really
matters here.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
/*
 * set_test.cpp:
 *
 * Measure the performance characteristics of using strcoll() as
 * a string comparator rather than caching strxfrm() blobs.
 *
 * Author: Peter Geoghegan
 *
 * A std::set is an ascending container of unique elements. The implementation
 * that I used for any numbers published was the one from Gnu libstdc++, on
 * Fedora 16. That particular implementation uses a red-black tree, but the
 * standard's requirement that common operations (search, insert, delete) take
 * logarithmic time effectively requires the use of some kind of self-balancing
 * binary search tree, which isn't a bad proxy for a btree for my purposes.
 */

#include <set>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <string>
#include <sys/time.h>

#define ORIG_STRING_SIZE	32
#define BLOB_STRING_SIZE	128

double
timeval_subtract(struct timeval *x, struct timeval *y)
{
	struct timeval result;
	/* Compute the difference x - y, expressed in seconds */
	result.tv_sec = x->tv_sec - y->tv_sec;
	result.tv_usec = x->tv_usec - y->tv_usec;
	if (result.tv_usec < 0)
	{
		result.tv_sec -= 1;
		result.tv_usec += 1000000;
	}
	return result.tv_sec + result.tv_usec / 1000000.0;
}

Re: [HACKERS] sortsupport for text

2012-06-20 Thread Peter Geoghegan
On 21 June 2012 01:22, Peter Geoghegan pe...@2ndquadrant.com wrote:
 I've written a very small C++ program, which I've attached, that
 basically proves that this can still make a fairly large difference -
 I hope it's okay that that it's C++, but that allowed me to write the
 program quickly, with no dependencies for anyone who cares to try this
 out, other than a C++ compiler and the standard library.

So, it turns out that with a relatively expensive to sort collation
like hu_HU.UTF-8, the difference here is huge:

[peter@peterlaptop strxfrm_test]$ ./a.out
locale set: hu_HU.UTF-8
Time elapsed with strcoll: 11.122695 seconds
Time elapsed with strxfrm optimization: 2.328365 seconds
Time elapsed with strcoll: 11.404160 seconds
Time elapsed with strxfrm optimization: 2.355684 seconds
Time elapsed with strcoll: 11.411680 seconds
Time elapsed with strxfrm optimization: 2.342553 seconds
Time elapsed with strcoll: 11.415693 seconds
Time elapsed with strxfrm optimization: 2.343593 seconds
Time elapsed with strcoll: 11.428686 seconds
Time elapsed with strxfrm optimization: 2.345204 seconds
strcoll average: 11.356583 strxfrm average: 2.343080

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Peter Geoghegan
So, just to give a bit more weight to my argument that we should
recognise that equivalent strings ought to be treated identically, I
direct your attention to conformance requirement C9 of Unicode 3.0:

http://www.unicode.org/unicode/standard/versions/enumeratedversions.html#Unicode_3_0_0

This requires that canonically equivalent text be treated
identically. See also C10 and several other places in The Unicode
Standard.

This page, DETERMINISTIC SORTING, was also interesting:

http://www.unicode.org/notes/tn9/

The same hack that we use in varstr_cmp() appears as pseudo-code,
under Forcing Deterministic Comparisons. I'm not sure in what
general sense a comparison *they'd* call non-deterministic (i.e.
plain old strcoll()) is actually non-deterministic. Clearly a
non-deterministic comparison is deterministic to the extent that,
given the same two binary strings, encoding, and collation, the
comparison function will always give the same answer.

The article goes on to describe how we could bolt-on a strxfrm() type
value to a string, to ensure deterministic comparison (i.e.
identical behaviour to Postgres master) with pre-processing. Tom did
think that this might still be worth it. I'd rather avoid it.

The article goes on to say of so-called deterministic comparisons:
However, a deterministic comparison is generally not best practice.
First, it has a certain performance cost in comparison, and a quite
substantial impact on sort key size. (For example, [ICU]
language-sensitive sort keys are generally about the size of the
original string, so appending a copy of the original string generally
doubles the size of the sort key.)  A database using these sort keys
can have substantially increased disk footprint and memory footprint,
and consequently reduced performance.

I note also that the The Single UNIX Specification, Version 2 says of
strcoll(): The strxfrm() and strcmp() functions should be used for
sorting large lists. Why has it taken us this long to research this?
I am aware that Greg Stark suggested it back in 2005, but I doubt that
that was the first time.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Kevin Grittner
Peter Geoghegan pe...@2ndquadrant.com wrote:
 
 So, just to give a bit more weight to my argument that we should
 recognise that equivalent strings ought to be treated identically
 
Since we appear to be questioning everything in this area, I'll
raise something which has been bugging me for a while: in some other
systems I've used, the tie-breaker comparison for equivalent
values comes after equivalence sorting on *all* sort keys, rather
than *each* sort key.  For example, this much makes sense with
lc_collate = 'en_US.UTF-8':
 
test=# create table c (last_name text not null, first_name text);
CREATE TABLE
test=# insert into c values ('smith', 'bob'), ('smith', 'peter'),
('SMITH', 'EDWARD');
INSERT 0 3
test=# select * from c order by 2;
 last_name | first_name 
-----------+------------
 smith     | bob
 SMITH     | EDWARD
 smith     | peter
(3 rows)
 
This seems completely wrong:
 
test=# select * from c order by 1,2;
 last_name | first_name 
-----------+------------
 smith     | bob
 smith     | peter
 SMITH     | EDWARD
(3 rows)
 
I have seen other databases which get it in the order I would expect
-- where the C compare only matters within groups of equivalent
rows.  It seems that PostgreSQL effectively orders by:
 
  last_name using collation 'en_US.UTF-8'
  last_name using collation 'C'
  first_name using collation 'en_US.UTF-8'
  first_name using collation 'C'
 
while some other products order by:
 
  last_name using collation 'en_US.UTF-8'
  first_name using collation 'en_US.UTF-8'
  last_name using collation 'C'
  first_name using collation 'C'
 
I'm sure the latter is harder to do and slower to execute; but the
former just doesn't seem defensible as correct.
 
-Kevin



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Peter Geoghegan
On 19 June 2012 16:17, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Peter Geoghegan pe...@2ndquadrant.com wrote:

 So, just to give a bit more weight to my argument that we should
 recognise that equivalent strings ought to be treated identically

 Since we appear to be questioning everything in this area, I'll
 raise something which has been bugging me for a while: in some other
 systems I've used, the tie-breaker comparison for equivalent
 values comes after equivalence sorting on *all* sort keys, rather
 than *each* sort key.

Are you sure that they actually have a tie-breaker, and don't just
make the distinction between equality and equivalence (if only
internally)? I would have checked that myself already, but I don't
have access to any other RDBMS that I'd expect to care about these
kinds of distinctions. They make sense for ensuring that the text
comparator's notion of equality is consistent with text's general
notion (if that's bitwise equality, which I suspect it is in these
other products too for the same reasons it is for us). I don't see why
you'd want a tie-breaker across multiple keys. I mean, you could, I
just don't see any reason to.

 test=# select * from c order by 2;
  last_name | first_name
 -----------+------------
  smith     | bob
  SMITH     | EDWARD
  smith     | peter
 (3 rows)

 This seems completely wrong:

 test=# select * from c order by 1,2;
  last_name | first_name
 -----------+------------
  smith     | bob
  smith     | peter
  SMITH     | EDWARD
 (3 rows)

Agreed. Definitely a POLA violation.

 I'm sure the latter is harder to do and slower to execute; but the
 former just doesn't seem defensible as correct.

This same gripe is held by the author of that sorting document I
linked to from the Unicode consortium, with a very similar example. So
it seems like this could be a win from several perspectives, as it
would enable the strxfrm() optimisation. I'm pretty sure that
pg_upgrade wouldn't be very happy about this, so we'd have to have a
legacy compatibility mode.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Kevin Grittner
Peter Geoghegan pe...@2ndquadrant.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 Since we appear to be questioning everything in this area, I'll
 raise something which has been bugging me for a while: in some
 other systems I've used, the tie-breaker comparison for
 equivalent values comes after equivalence sorting on *all* sort
 keys, rather than *each* sort key.
 
 Are you sure that they actually have a tie-breaker, and don't just
 make the distinction between equality and equivalence (if only
 internally)?
 
I'm pretty sure that when I was using Sybase ASE the order for
non-equal values was always predictable, and it behaved in the
manner I describe below.  I'm less sure about any other product.
 
 I don't see why you'd want a tie-breaker across multiple keys. I
 mean, you could, I just don't see any reason to.
 
So that you can have entirely repeatable results across multiple
runs?
 
 test=# select * from c order by 2;
  last_name | first_name
 -----------+------------
  smith | bob
  SMITH | EDWARD
  smith | peter
 (3 rows)

 This seems completely wrong:

 test=# select * from c order by 1,2;
  last_name | first_name
 -----------+------------
  smith | bob
  smith | peter
  SMITH | EDWARD
 (3 rows)
 
 Agreed. Definitely a POLA violation.
 
 I'm sure the latter is harder to do and slower to execute; but
 the former just doesn't seem defensible as correct.
 
 This same gripe is held by the author of that sorting document I
 linked to from the Unicode consortium, with a very similar
 example. So it seems like this could be a win from several
 perspectives, as it would enable the strxfrm() optimisation. I'm
 pretty sure that pg_upgrade wouldn't be very happy about this, so
 we'd have to have a legacy compatibility mode.
 
At a minimum, it could require rebuilding indexes with multiple
columns where any indexed value before the last index column can be
equivalent but not equal.
 
-Kevin



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Peter Geoghegan
On 19 June 2012 17:45, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Peter Geoghegan pe...@2ndquadrant.com wrote:
 Are you sure that they actually have a tie-breaker, and don't just
 make the distinction between equality and equivalence (if only
 internally)?

 I'm pretty sure that when I was using Sybase ASE the order for
 non-equal values was always predictable, and it behaved in the
 manner I describe below.  I'm less sure about any other product.

Maybe it used a physical row identifier as a tie-breaker? Note that we
use ItemPointers as a tie-breaker for sorting index tuples.

I imagine that it was at least predictable among columns being sorted,
if only because en_US.UTF-8 doesn't have any notion of equivalence
(that is, it just so happens that there are no two strings that are
equivalent but not bitwise equal). It would surely be impractical to
do a comparison for the entire row, as that could be really expensive.

 I don't see why you'd want a tie-breaker across multiple keys. I
 mean, you could, I just don't see any reason to.

 So that you can have entirely repeatable results across multiple
 runs?

I suppose that isn't an unreasonable use-case, but it would be
unreasonable to require that everyone pay for it if, particularly if
it implied another strcmp(). You don't need me to tell you that we
generally discourage the idea that the order that tuples are returned
in is defined in any way beyond that asked for by the user.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Kevin Grittner
Peter Geoghegan pe...@2ndquadrant.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 I'm pretty sure that when I was using Sybase ASE the order for
 non-equal values was always predictable, and it behaved in the
 manner I describe below.  I'm less sure about any other product.
 
 Maybe it used a physical row identifier as a tie-breaker? Note
 that we use ItemPointers as a tie-breaker for sorting index
 tuples.
 
 I imagine that it was at least predictable among columns being
 sorted, if only because en_US.UTF-8 doesn't have any notion of
 equivalence (that is, it just so happens that there are no two
 strings that are equivalent but not bitwise equal). It would
 surely be impractical to do a comparison for the entire row, as
 that could be really expensive.
 
We weren't using en_US.UTF-8 collation (or any other proper
collation) on Sybase -- I'm not sure whether they even supported
proper collation sequences on the versions we used.  I'm thinking of
when we were using their case insensitive sorting.  I don't know
the implementation details, but the behavior was consistent with
including each character-based column twice: once in the requested
position in the ORDER BY clause but folded to a consistent case, and
again after all the columns in the ORDER BY clause in original form,
with C collation.
 
I wasn't aware that en_US.UTF-8 doesn't have equivalence without
equality.  I guess that surprising result in my last post is just
plain inevitable with that collation then.  Bummer.  Is there
actually anyone who finds that to be a useful behavior?  For a
collation which considered upper-case and lower-case to be
equivalent, would PostgreSQL sort as I wanted, or is it doing some
tie-break per column within equivalent values?
 
-Kevin



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Peter Geoghegan
On 19 June 2012 18:57, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 We weren't using en_US.UTF-8 collation (or any other proper
 collation) on Sybase -- I'm not sure whether they even supported
 proper collation sequences on the versions we used.  I'm thinking of
 when we were using their case insensitive sorting.  I don't know
 the implementation details, but the behavior was consistent with
 including each character-based column twice: once in the requested
 position in the ORDER BY clause but folded to a consistent case, and
 again after all the columns in the ORDER BY clause in original form,
 with C collation.

 I wasn't aware that en_US.UTF-8 doesn't have equivalence without
 equality.

Not that many do. The underlying cause of the problem back in 2005 was
the tacit assumption that none do, which presumably we got away with
for a while. I mentioned Hungarian, but it happens a bit in Swedish
too.

PostgreSQL supported Unicode before 2005, when the tie-breaker was
introduced. I know at least one Swede who used Postgres95. I just took
a look at the REL6_4 branch, and it looks much the same in 1999 as it
did in 2005, in that there is no tie-breaker after the strcoll(). Now,
that being the case, and Hungarian in particular having a whole bunch
of these equivalencies, I have to wonder if the original complainant's
problem really was diagnosed correctly. It could have had something to
do with the fact that texteq() was confused about whether it reported
equality or equivalency - it may have taken that long for the (len1 !=
len2) fastpath (which only holds for equality, not equivalence,
despite the fact that the 2005-era strcoll() call checks equivalence
within texteq()) to trip someone up, because texteq() would have
thereby given inconsistent answers in a very subtle way - answers that
were correct neither according to the Hungarian locale, nor according
to simple bitwise equality. That's mostly speculation, but the question
must be asked.

 I guess that surprising result in my last post is just
 plain inevitable with that collation then.  Bummer.  Is there
 actually anyone who finds that to be a useful behavior?

Looking at the details again and assuming a US locale, yeah, it is.
The substance of your complaint holds though.

 For a collation which considered upper-case and lower-case to be
 equivalent, would PostgreSQL sort as I wanted, or is it doing some
 tie-break per column within equivalent values?

You could do that, and some people do use custom collations for
various reasons. That's obviously very much of minority interest
though. Most people will just use citext or something. However, since
citext is itself a client of varstr_cmp(), this won't help you.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I wasn't aware that en_US.UTF-8 doesn't have equivalence without
 equality.  I guess that surprising result in my last post is just
 plain inevitable with that collation then.  Bummer.  Is there
 actually anyone who finds that to be a useful behavior?  For a
 collation which considered upper-case and lower-case to be
 equivalent, would PostgreSQL sort as I wanted, or is it doing some
 tie-break per column within equivalent values?

Well, there are two different questions there.

As far as the overall structure of an ORDER BY or btree index is
concerned, all it knows is what the datatype comparator functions
tell it.  There is no such thing as a second pass to reconsider
values found to be equal; that's all the info there is.

As far as the behavior of the comparator function for text is concerned,
we choose to break ties reported by strcoll via strcmp.  But that's not
visible from outside the comparator, so there's no notion of an
equivalence behavior different from equality.

I'm not exactly convinced that we could have a second-pass tiebreak
mechanism without creating serious problems for btree indexes; in
particular it seems like ordering considering only some leading keys
might not match the actual index ordering.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-19 Thread Peter Geoghegan
On 19 June 2012 19:44, Peter Geoghegan pe...@2ndquadrant.com wrote:
 You could do that, and some people do use custom collations for
 various reasons. That's obviously very much of minority interest
 though. Most people will just use citext or something. However, since
 citext is itself a client of varstr_cmp(), this won't help you.

I spoke too soon - that would work fine in the U.S., since the text is
converted to lower case on-the-fly in a locale- and collation-aware
fashion.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-18 Thread Peter Geoghegan
On 18 June 2012 00:38, Tom Lane t...@sss.pgh.pa.us wrote:
 The only reason we test a = b and not a < b || a > b
 is that the latter is at least twice as expensive to evaluate.

Perhaps I've been unclear. I meant to write (!(a < b) && !(b < a)),
and not (a < b && b < a). The former is not tautological when the
trichotomy law holds, but the latter is. My mistake - I was a little
tired when I wrote that (though you seemed to know what I meant).

You can sort things just fine if you only have a less than comparator
than returns a boolean, which is what I sought to illustrate with that
example: There is no requirement that the comparator indicate anything
more than whether one element is less than the other. In particular,
it need not tell the sort function that two elements are equal, though
the sort function will still (almost invariably in practice) put equal
elements beside each other. Consider std::set, a highly-generic
template class that enforces uniqueness, usually implemented using a
red-black tree (so its elements are sorted), that only requires that
the datatype it is instantiated with have an operator<. The datatype
used may not even have an operator== for std::set to consult if it
wanted to.

So, in principle, std::set could figure out if any two given elements
were equivalent. That is to say, it could determine:

if (!(a < b) && !(b < a))

It could not, even in principle given its implicit
interface/requirements for datatypes, know for sure that:

if (a == b)

Despite the fact that equivalency and equality can be expected to
coincide the vast majority of the time, they are not quite the same
thing. It's perfectly reasonable that it might actually not hold that
a and b are equal, even though they are equivalent, for certain
datatypes. Certainly not for integers. But, for example, it could make
sense to have a string datatype where with a certain locale a certain
Hungarian word was equivalent to the same string with a minor
variation in diacritical marks or whatever (an entirely culturally
defined equivalency), even though it was not equal according to this
datatype's own idiosyncratic notion of equality, which is bitwise
equality.

That would be a datatype that std::set would be entirely capable of
rolling with, because it doesn't care about equality. It might be
better if our text type had the same semantics as this hypothetical
string datatype, because:

1. A culture has gone so far as to make these variations of
diacritical marks or whatever equivalent. The Unicode consortium
consulted with the Hungarian government, or maybe the Hungarian
version of the Académie française, and they asked for this, which
seems pretty serious to me. Maybe they'd like to have a unique
constraint reject what they consider to be equivalent strings. If this
sounds pedantic to you, google "Han unification controversy", because
this sounds like that in reverse (Hungarian separation controversy).
The fact that when a unique constraint is violated, it isn't actually
necessary to check the equality operator function (equivalence will
do; no need to call texteq as when selecting with the same value in
the qual) suggests that this wouldn't be too hard to facilitate,
though again, I admit my understanding here is more shallow than I'd
like. Now, I'll grant you that I'm not currently losing any sleep
about our cultural insensitivity, and apparently neither are any
Hungarians; this just serves to illustrate that my position is The
Right Thing (I hope).

Here is a description of the Hungarian problem:

http://blogs.msdn.com/b/michkap/archive/2005/08/10/449909.aspx

It says "Basically, the feature is such that since dzs is a
compression within Hungarian, that if you see ddzs it should be
treated as equivalent to dzsdzs".

We're not the first people to have to deal with this:

http://bugs.mysql.com/bug.php?id=12519

Here is Tom's original analysis, that lead to this patch:

http://archives.postgresql.org/pgsql-general/2005-12/msg00803.php

2. This way, it's still possible to do the strxfrm() optimisation more
or less for free, without breaking hash methods on text. That is not
to be sniffed at - sorting text is very common and important. Most
people don't know about the C locale, and couldn't use it if they did.

 The last section of src/backend/access/nbtree/README has some notes
 that you might find relevant.

I assume that you're referring to "Notes to Operator Class
Implementors". It says:


An = operator must be an equivalence relation; that is, for all non-null
values A,B,C of the datatype:

A = A is true                   reflexive law
if A = B, then B = A            symmetric law
if A = B and B = C, then A = C  transitive law


This would still hold; we'd just have to change the = operators here
to something 

Re: [HACKERS] sortsupport for text

2012-06-18 Thread Peter Geoghegan
On 18 June 2012 16:59, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Perhaps more importantly, I cannot recreate any of these problems on
 my Fedora 16 machine. Even with hu_HU on LATIN2, Tom's original test
 case (from 2005, on a Fedora 4 machine) cannot be recreated. So it may
 be that they've tightened these things up in some way. It's far from
 clear why that should be.

 It could be worth

Ugh, hit send too soon. I meant to close with the observation that it
may be safe to rely on strcoll() / strxfrm() + strcmp() not ever
returning 0 on certain platforms, if my inability to recreate this is
anything to go by. There could be a test run by configure that
determines this, enabling us to then use the strxfrm() optimisation.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
The fly in the ointment for strxfrm() adoption may be the need to be
consistent with this earlier behaviour:

commit 656beff59033ccc5261a615802e1a85da68e8fad
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Thu Dec 22 22:50:00 2005 +

Adjust string comparison so that only bitwise-equal strings are considered
equal: if strcoll claims two strings are equal, check it with strcmp, and
sort according to strcmp if not identical.  This fixes inconsistent
behavior under glibc's hu_HU locale, and probably under some other locales
as well.  Also, take advantage of the now-well-defined behavior to speed up
texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise
comparison and not bother with strcoll at all.

NOTE: affected databases may need to REINDEX indexes on text columns to be
sure they are self-consistent.

Here is the relevant code:

    /*
     * In some locales strcoll() can claim that nonidentical strings are
     * equal.  Believing that would be bad news for a number of reasons,
     * so we follow Perl's lead and sort equal strings according to
     * strcmp().
     */
    if (result == 0)
        result = strcmp(a1p, a2p);

I'm not sure I agree with this decision; why should we presume to know
better than the glibc locale what constitutes equality? What are the
number of reasons referred to? It's seems very likely that the main
one was the then-need to guard against poor quality qsort()
implementations that went quadratic in the face of lots of duplicates,
but we already removed a bunch of other such hacks, because of course
we now control the qsort implementation used, and have since the year
after this commit was made, 2006.

Obviously this decision was made a number of years ago now, and at
least one person went on to rely on this behaviour, so it can only be
revisited with that in mind. However, provided we are able to say
"here is a compatibility ordering operator" to those that complain
about this, and provided it is appropriately listed as a compatibility
issue in the 9.3 release notes, I think it would be worth reverting
this commit to facilitate strxfrm().

How many people:

A) are using hu_HU or some other locale where this can happen?

and

B) will care?

Now, I'm sure that there is another workaround too, so this doesn't
need to be a blocker even if it is absolutely unacceptable to revert -
but I have to wonder if that's worth it. People don't have any
business relying on a sort order that is consistent in any way other
than the one they actually asked for. A few people still do even as we
go blue in the face telling them not to of course, but that's fairly
normal.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 The fly in the ointment for strxfrm() adoption may be the need to be
 consistent with this earlier behaviour:

 if strcoll claims two strings are equal, check it with strcmp, and
 sort according to strcmp if not identical.

 I'm not sure I agree with this decision; why should we presume to know
 better than the glibc locale what constitutes equality?

The killer reason why it must be like that is that you can't use hash
methods on text if text equality is some unknown condition subtly
different from bitwise equality.  My recollection is that there were
some other problems as well, but I'm too lazy to search the archives
for you.

 It's seems very likely that the main
 one was the then-need to guard against poor quality qsort()
 implementations that went quadratic in the face of lots of duplicates,

No, I don't recall that that had anything to do with it.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On 17 June 2012 17:01, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not sure I agree with this decision; why should we presume to know
 better than the glibc locale what constitutes equality?

 The killer reason why it must be like that is that you can't use hash
 methods on text if text equality is some unknown condition subtly
 different from bitwise equality.

Fair enough, but I doubt that we need to revert the changes made in
this commit to texteq in addition to the changes I'd like to see in
order to be semantically self-consistent. That is because there is
often a distinction made between equality and equivalence, and we
could adopt this distinction. strcoll() could be said to be just
making a representation that its two arguments are equivalent (and not
necessarily equal) when it returns 0. This distinction is explicitly
made in the C++ standard library, and failing to understand it can
result in bugs:

http://www.cplusplus.com/reference/algorithm/equal_range/

Note the use of the word "equivalent" rather than "equal" in the text.
equal_range is a bit of a misnomer.

This distinction is important enough to have warranted an entire
subsection of the book Effective STL by Scott Meyers, a
well-respected expert on the language. This comes up more often than
you'd think - std::set::insert determines if an element already
exists (to know if it must replace it) based on equivalency (usually,
though not necessarily, defined in terms of operator<), whereas the
find algorithm finds elements based on equality (operator==).

 My recollection is that there were
 some other problems as well, but I'm too lazy to search the archives
 for you.

Fair enough. I'll search for it myself later. I'm about to head out now.

 It's seems very likely that the main
 one was the then-need to guard against poor quality qsort()
 implementations that went quadratic in the face of lots of duplicates,

 No, I don't recall that that had anything to do with it.

Oh, okay. It looked very much like the "avoid equality at all costs"
thing you still see some of in tuplesort.c.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 17 June 2012 17:01, Tom Lane t...@sss.pgh.pa.us wrote:
 The killer reason why it must be like that is that you can't use hash
 methods on text if text equality is some unknown condition subtly
 different from bitwise equality.

 Fair enough, but I doubt that we need to revert the changes made in
 this commit to texteq in addition to the changes I'd like to see in
 order to be semantically self-consistent. That is because there is
 often a distinction made between equality and equivalence, and we
 could adopt this distinction.

How exactly do you plan to shoehorn that into SQL?  You could invent
some nonstandard equivalence operator I suppose, but what will be the
value?  We aren't going to set things up in such a way that we can't
use hash join or hash aggregation in queries that use the regular =
operator.  IMO there just aren't going to be enough people who care to
use a non-default operator.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On Jun 17, 2012 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Peter Geoghegan pe...@2ndquadrant.com writes:
  On 17 June 2012 17:01, Tom Lane t...@sss.pgh.pa.us wrote:
 How exactly do you plan to shoehorn that into SQL?  You could invent
 some nonstandard equivalence operator I suppose, but what will be the
 value?  We aren't going to set things up in such a way that we can't
 use hash join or hash aggregation in queries that use the regular =
 operator.

Right, most people won't care. You may or may not want a new
operator for equivalency. The regular operator for equality doesn't have to
and shouldn't change. It is both useful and conceptually clean to not
guarantee that a comparator can be relied upon to indicate equality and not
just equivalency.


Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 Right, most people won't care. You may or may not want a new
 Operator for equivalency. The regular operator for equality doesn't have to
 and shouldn't change. It is both useful and conceptually clean to not
 guarantee that a compator can be relied upon to indicate equality and not
 just equivalency.

Sure, and in general we only expect that = operators mean equivalency;
a concrete example is float8 =, which on IEEE-spec machines will say
that zero and minus zero are equal.

The trick for hashing such datatypes is to be able to guarantee that
equal values hash to the same hash code, which is typically possible
as long as you know the equality rules well enough.  We could possibly
do that for text with pure-strcoll equality if we knew all the details
of what strcoll would consider equal, but we do not.

See also citext for an example of a datatype where we can manage to
treat distinct textual values as equal and still hash them.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On 17 June 2012 21:26, Tom Lane t...@sss.pgh.pa.us wrote:
 Sure, and in general we only expect that = operators mean equivalency;
 a concrete example is float8 =, which on IEEE-spec machines will say
 that zero and minus zero are equal.

Right; the spec says that, and we punt to the spec. No one sensible
thinks that minus zero and zero floats are not equal, by virtue of the
fact that if you compare them in C using ==, it evaluates to true.
Equivalency as I use the term can't really be talked about without
talking about sorting or less than operators.

 The trick for hashing such datatypes is to be able to guarantee that
 equal values hash to the same hash code, which is typically possible
 as long as you know the equality rules well enough.  We could possibly
 do that for text with pure-strcoll equality if we knew all the details
 of what strcoll would consider equal, but we do not.

I see. I am tentatively suggesting that we don't change the definition
of equal, but allow that equivalent text values may not be equal.

 See also citext for an example of a datatype where we can manage to
 treat distinct textual values as equal and still hash them.

I'm not talking about textually distinct values being
equal/equivalent. That doesn't quite capture it (although I suppose a
corollary of what I am trying to express is that equivalent though
non-equal values should invariably have different textual
representations in Postgres).

I think a good example that illustrates the difference between
equivalency and equality is the German ß character (esszet), which is
regarded as equivalent to 'ss' (two 's' characters). The following
German words are sorted correctly as required by de_DE.UTF-8:

Arg
Ärgerlich
Arm
Assistant
Aßlar
Assoziation

So if you take the word Aßlar here - that is equivalent to Asslar,
and so strcoll(Aßlar, Asslar) will return 0 if you have the right
LC_COLLATE (if you tried this out for yourself and found that I was
actually lying through my teeth, pretend I said Hungarian instead of
German and some really obscure character rather than ß). It seems a
bit unsatisfactory to me that a unique constraint will happily accept
these two equivalent strings because they're not bitwise identical,
when the collation was supposed to have that normalisation baked in.
At the same time, I find it intuitively obvious that Aßlar ==
Asslar is false, and at the very least I think that very few people
would not grant that it is a not unreasonable state of affairs for
both of those two things to hold, at least on the face of it
(presuming that I wasn't actually lying about the particulars of this,
which I was, since this is apparently confined to less popular locales
and I'm too lazy to research the exact details).

Now, I do realise that there is what might appear to be a tension in
what I'm saying; it is precisely the fact that we can traverse a btree
index using comparators that allows a btree to satisfy an equality
condition (or an equivalency condition; however you choose to
characterise whatever it is that the '=' operator does).

To restate the problem: The '=' operator implies equivalency and not
equality. Or does it? Look at this code from nbtutils.c's
_bt_checkkeys() function:

test = FunctionCall2Coll(&key->sk_func, key->sk_collation,
                         datum, key->sk_argument);

if (!DatumGetBool(test))
{
    /*
     * Tuple fails this qual.  If it's a required qual for the current
     * scan direction, then we can conclude no further tuples will
     * pass, either.
     *
     * Note: because we stop the scan as soon as any required equality
     * qual fails, it is critical that equality quals be used for the
     * initial positioning in _bt_first() when they are available. See
     * comments in _bt_first().
     */
    ***SNIP***

The qual is verified for the index tuple itself (the test return value
lets us know if it matches) - the equality operator is actually
called, and is actually re-verified via texteq(). So what appears to
happen is that the btree code finds equivalent tuples, and then,
knowing that all pairs of equal tuples are equivalent (but not
necessarily the inverse) checks that it actually has a tuple that's
equal/satisfies the qual. Makes sense, and by this standard I'd judge
that '=' was actually an equality operator that sometimes took
advantage of equivalency purely as an implementation detail, but I'm
not familiar enough with that part of the code to have any degree of
confidence that I haven't made a leap here that I shouldn't have.

ISTM if '=' was really a mere equivalency operator, we'd only ever
check (a <= b && b <= a) in the btree code.

It occurs to me that we might also have a new 

Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On 17 June 2012 23:58, Peter Geoghegan pe...@2ndquadrant.com wrote:
 We can decree that equivalency implies equality, or make all this
 internal (which, perversely, I suppose the C++ committee people
 cannot).

Sorry, that should obviously read equality implies equivalency. We
may not have to decree it, because it may already be a tacit
assumption - I'm not sure.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 ISTM if '=' was really a mere equivalency operator, we'd only ever
 check (a <= b && b <= a) in the btree code.

You're not really making a lot of sense here, or at least I'm not
grasping the distinction you want to draw.  btree indexes (and sorting
in general) require the trichotomy law to hold, so what you say above is
tautological.  The only reason we test a = b and not a  b || a  b
is that the latter is at least twice as expensive to evaluate.
The last section of src/backend/access/nbtree/README has some notes
that you might find relevant.

 Simple question: if you were to just remove the strcmp tie-breaker for
 strcoll() in varstr_cmp(), but not touch anything else, would Postgres
 exhibit objectively incorrect behaviour?

Yes, it would, and did, before we put that in; see the archives for the
discussions that led up to the patch you mentioned earlier.

 So, I may have lost sight of why I starting on about equivalency,
 which is that it sure would be nice if we could use strxfrm to prepare
 strings for sorting, which looks to be a fairly significant win.

We could still do that as long as we were willing to store the original
strings as well as the strxfrm output.  Given the memory bloat already
implied by strxfrm, I don't think that's necessarily ridiculous on its
face --- it just makes the bar a bit higher for whether this is a win.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-16 Thread Peter Geoghegan
On 18 March 2012 15:08, Tom Lane t...@sss.pgh.pa.us wrote:
 However, it occurred to me that we could pretty easily jury-rig
 something that would give us an idea about the actual benefit available
 here.  To wit: make a C function that wraps strxfrm, basically
 strxfrm(text) returns bytea.  Then compare the performance of
 ORDER BY text_col to ORDER BY strxfrm(text_col).

 (You would need to have either both or neither of text and bytea
 using the sortsupport code paths for this to be a fair comparison.)

I thought this was an interesting idea, so decided to try it out for
myself. I tried this out against master (not Robert's patch, per Tom's
direction). The results were interesting:

[peter@peterlaptop strxfrm_test]$ pgbench postgres -T 60 -f sort_strxfrm.sql -n
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 2795
tps = 46.563970 (including connections establishing)
tps = 46.568234 (excluding connections establishing)
[peter@peterlaptop strxfrm_test]$ pgbench postgres -T 60 -f sort_reg.sql -n
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 2079
tps = 34.638838 (including connections establishing)
tps = 34.640665 (excluding connections establishing)

The first test executed the following query against the dellstore database:

select * from products order by strxfrm_test(actor) offset 10001;

The second:

select * from products order by actor offset 10001;

So, this was pretty good - an improvement that is completely
independent of Robert's. Bear in mind, this simple demonstration adds
additional fmgr overhead, which we have plenty of reason to believe
could hurt things, besides which each call must allocate memory that
could perhaps be avoided. In addition, I don't know enough about
locale-aware sorting and related algorithms to have devised a test
that would stress strxfrm()/ strcoll() - these were all strings that
could be represented as ASCII.

In light of this, I think there is a pretty strong case to be made for
pre-processing text via strxfrm() as part of this patch.

Thoughts?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


strxfrm_test.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] sortsupport for text

2012-06-15 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 14 June 2012 19:28, Robert Haas robertmh...@gmail.com wrote:
 I thought that doubling repeatedly would be overly aggressive in terms
 of memory usage.

 I fail to understand how this sortsupport buffer fundamentally differs
 from a generic dynamic array abstraction built to contain chars. That
 being the case, I see no reason not to just do what everyone else does
 when expanding dynamic arrays, and no reason why we shouldn't make
 essentially the same time-space trade-off here as others do elsewhere.

I agree with Peter on this one; not only is double-each-time the most
widespread plan, but it is what we do in just about every other place
in Postgres that needs a dynamically expansible buffer.  If you do it
randomly differently here, readers of the code will be constantly
stopping to wonder why it's different here and if that's a bug or not.
(And from a performance standpoint, I'm not entirely convinced it's not
a bug, anyway.  Worst-case behavior could be pretty bad.)

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-15 Thread Robert Haas
On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 On 14 June 2012 19:28, Robert Haas robertmh...@gmail.com wrote:
 I thought that doubling repeatedly would be overly aggressive in terms
 of memory usage.

 I fail to understand how this sortsupport buffer fundamentally differs
 from a generic dynamic array abstraction built to contain chars. That
 being the case, I see no reason not to just do what everyone else does
 when expanding dynamic arrays, and no reason why we shouldn't make
 essentially the same time-space trade-off here as others do elsewhere.

 I agree with Peter on this one; not only is double-each-time the most
 widespread plan, but it is what we do in just about every other place
 in Postgres that needs a dynamically expansible buffer.  If you do it
 randomly differently here, readers of the code will be constantly
 stopping to wonder why it's different here and if that's a bug or not.

That could, of course, be addressed by adding a comment.

 (And from a performance standpoint, I'm not entirely convinced it's not
 a bug, anyway.  Worst-case behavior could be pretty bad.)

Instead of simply asserting that, could you respond to the specific
points raised in my analysis?  I think there's no way it can be bad.
I am happy to be proven wrong, but I like to understand why it is that
I am wrong before changing things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 (And from a performance standpoint, I'm not entirely convinced it's not
 a bug, anyway.  Worst-case behavior could be pretty bad.)

 Instead of simply asserting that, could you respond to the specific
 points raised in my analysis?  I think there's no way it can be bad.
 I am happy to be proven wrong, but I like to understand why it is that
 I am wrong before changing things.

Maybe I missed something, but as far as I saw your argument was not that
the performance wasn't bad but that the rest of the sort code would
dominate the runtime anyway.  I grant that entirely, but that doesn't
mean that it's good for this piece of it to possibly have bad behavior.

In any case, it seems to me that if the bottom line is that the
performance of this piece of code isn't going to matter overall,
then we might as well use the simplest, least surprising implementation
we can.  And I concur with Peter that a doubling-based approach meets
that description, while what you've got here doesn't.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-15 Thread Robert Haas
On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 (And from a performance standpoint, I'm not entirely convinced it's not
 a bug, anyway.  Worst-case behavior could be pretty bad.)

 Instead of simply asserting that, could you respond to the specific
 points raised in my analysis?  I think there's no way it can be bad.
 I am happy to be proven wrong, but I like to understand why it is that
 I am wrong before changing things.

 Maybe I missed something, but as far as I saw your argument was not that
 the performance wasn't bad but that the rest of the sort code would
 dominate the runtime anyway.  I grant that entirely, but that doesn't
 mean that it's good for this piece of it to possibly have bad behavior.

That, plus the fact that not wasting memory in code paths where memory
is at a premium seems important to me.  I'm shocked that either of you
think it's OK to overallocate by as much as 2X in a code path that's
only going to be used when we're going through fantastic gyrations to
make memory usage fit inside work_mem.  The over-allocation by itself
could easily exceed work_mem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-15 Thread Peter Geoghegan
On 15 June 2012 21:06, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 (And from a performance standpoint, I'm not entirely convinced it's not
 a bug, anyway.  Worst-case behavior could be pretty bad.)

 Instead of simply asserting that, could you respond to the specific
 points raised in my analysis?  I think there's no way it can be bad.
 I am happy to be proven wrong, but I like to understand why it is that
 I am wrong before changing things.

 Maybe I missed something, but as far as I saw your argument was not that
 the performance wasn't bad but that the rest of the sort code would
 dominate the runtime anyway.  I grant that entirely, but that doesn't
 mean that it's good for this piece of it to possibly have bad behavior.

 That, plus the fact that not wasting memory in code paths where memory
 is at a premium seems important to me.  I'm shocked that either of you
 think it's OK to overallocate by as much as 2X in a code path that's
 only going to be used when we're going through fantastic gyrations to
 make memory usage fit inside work_mem.  The over-allocation by itself
 could easily exceed work_mem.

That seems pretty thin to me. We're talking about a couple of buffers
whose ultimate size is only approximately just big enough to hold the
largest text datum seen when sorting. Meanwhile, if it's the leading
key we're dealing with (and of course, it usually will be), before
exceeding work_mem the *entire* set of strings to be sorted is
sitting in palloc()'d memory anyway. I'm surprised that you didn't
immediately concede the point, to be honest.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Maybe I missed something, but as far as I saw your argument was not that
 the performance wasn't bad but that the rest of the sort code would
 dominate the runtime anyway.  I grant that entirely, but that doesn't
 mean that it's good for this piece of it to possibly have bad behavior.

 That, plus the fact that not wasting memory in code paths where memory
 is at a premium seems important to me.  I'm shocked that either of you
 think it's OK to overallocate by as much as 2X in a code path that's
 only going to be used when we're going through fantastic gyrations to
 make memory usage fit inside work_mem.  The over-allocation by itself
 could easily exceed work_mem.

I would be concerned about this if it were per-sort-tuple wastage, but
what I understood to be happening was that this was a single instance of
an expansible buffer (per sort, perhaps, but still just one buffer).
And, as you keep pointing out when it suits your argument, it's
relatively uncommon to be sorting enormous values anyway.  So no, I am
not particularly worried about that.  If you are, there are more
important places to be concerned about allocation pad wastage, starting
with palloc itself.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Peter Geoghegan
On 18 March 2012 15:08, Tom Lane t...@sss.pgh.pa.us wrote:
 One other thing I've always wondered about in this connection is the
 general performance of sorting toasted datums.  Is it better to detoast
 them in every comparison, or pre-detoast to save comparison cycles at
 the cost of having to push much more data around?  I didn't see any
 discussion of this point in Robert's benchmarks, but I don't think we
 should go very far towards enabling sortsupport for text until we
 understand the issue and know whether we need to add more infrastructure
 for it.  If you cross your eyes a little bit, this is very much like
 the strxfrm question...

I see the parallels. I note that glibc's strcoll_l() is implemented
entirely in C (strcoll() itself is implemented in terms of strcoll_l()
), whereas the various strcmp.S are written in hand-optimized
assembler, with SSE3 instructions in the Highly optimized version for
x86-64, for example. I wonder just how important a factor that is. I
suppose the reason why the glibc guys haven't just done something
equivalent internally might be that they much prefer to perform the
comparison in-place, due to the need to target a conservative lowest
common denominator...or it could be because it just doesn't matter
that much.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Robert Haas
On Thu, Jun 14, 2012 at 11:36 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 18 March 2012 15:08, Tom Lane t...@sss.pgh.pa.us wrote:
 One other thing I've always wondered about in this connection is the
 general performance of sorting toasted datums.  Is it better to detoast
 them in every comparison, or pre-detoast to save comparison cycles at
 the cost of having to push much more data around?  I didn't see any
 discussion of this point in Robert's benchmarks, but I don't think we
 should go very far towards enabling sortsupport for text until we
 understand the issue and know whether we need to add more infrastructure
 for it.  If you cross your eyes a little bit, this is very much like
 the strxfrm question...

 I see the parallels.

The problem with pre-detoasting to save comparison cycles is that you
can now fit many, many fewer tuples in work_mem.  There might be cases
where it wins (for example, because the entire data set fits even
after decompressing everything) but in most cases it seems like a
loser.

Also, my guess is that most values people sort by are pretty short,
making this concern mostly academic.  Suppose you are sorting a bunch
of strings which might be either 100 characters in length or 1MB.  If
they're all 100 characters, you probably don't need to detoast.  If
they're all 1MB, you probably can't detoast without eating up a ton of
memory (and even if you have it, this might not be the best use for
it).  If you have a mix, detoasting might be affordable provided that
the percentage of long strings is small, but it's also not going to
save you much, because if the percentage of long strings is small,
then most comparisons will be between two short strings where we don't
save anything anyway.

All things considered, this seems to me to be aiming at a pretty
narrow target, but maybe I'm just not thinking about it creatively
enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Peter Geoghegan
On 14 June 2012 17:35, Robert Haas robertmh...@gmail.com wrote:
 The problem with pre-detoasting to save comparison cycles is that you
 can now fit many, many fewer tuples in work_mem.  There might be cases
 where it wins (for example, because the entire data set fits even
 after decompressing everything) but in most cases it seems like a
 loser.

If I had to guess, I'd say you're probably right about that -
optimising sorting toasted text doesn't seem like a terribly sensible
use of your time.

What about the strxfrm suggestion of Greg's? You might find that the
added benefit of being able to avail of a highly optimised strcmp()
tipped the balance in favour of that idea, beyond the simple fact that
there's only a linear number of what you might loosely call strcoll_l
units of work rather than as many as O(n ^ 2). Furthermore, I'd
speculate that if you were to interlace the strxfrm() calls with
copying each text string, somewhere like within a specialised
datumCopy(), that would make the approach more efficient still, as you
specify a location for the blob in the just-palloc()'d leading-key
private memory directly, rather than just using memcpy.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Peter Geoghegan
On 2 March 2012 20:45, Robert Haas robertmh...@gmail.com wrote:
 I decided to investigate the possible virtues of allowing text to
 use the sortsupport infrastructure, since strings are something people
 often want to sort.

I should mention up-front that I agree with the idea that it is worth
optimising text sorting because it is a very common thing to have to
do, and therefore the standard for inclusion ought to be lower. I
don't intend to talk about tapesort though - that isn't really fair,
not least because I have some serious doubts about the quality of our
implementation. Furthermore, I think that it is logical that doing
things like resolving collations occur within a preparatory function
in advance of sorting, rather than redundantly doing that for each and
every comparison.

Why have you made the reusable buffer managed by sortsupport
TEXTBUFLEN-aligned? The existing rationale for that constant (whose
value is 1024) does not seem to carry forward here:

 * This should be large enough that most strings will be fit, but small
 * enough that we feel comfortable putting it on the stack.

ISTM it would be on average worth the hit of having to repalloc a few
more times for larger strings by making that buffer much smaller
initially, and doubling its size each time that proved insufficient,
rather than increasing its size to the smallest possible
TEXTBUFLEN-aligned size that you can get away with for the immediately
subsequent memcpy. Realistically, any database I've ever worked with
would probably be able to fit a large majority of its text strings
into 16 chars of memory - you yourself said that sorting toasted text
isn't at all common.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Robert Haas
On Thu, Jun 14, 2012 at 1:56 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 14 June 2012 17:35, Robert Haas robertmh...@gmail.com wrote:
 The problem with pre-detoasting to save comparison cycles is that you
 can now fit many, many fewer tuples in work_mem.  There might be cases
 where it wins (for example, because the entire data set fits even
 after decompressing everything) but in most cases it seems like a
 loser.

 If I had to guess, I'd say you're probably right about that -
 optimising sorting toasted text doesn't seem like a terribly sensible
 use of your time.

 What about the strxfrm suggestion of Greg's? You might find that the
 added benefit of being able to avail of a highly optimised strcmp()
 tipped the balance in favour of that idea, beyond the simple fact that
 there's only a linear number of what you might loosely call strcoll_l
 units of work rather than as many as O(n ^ 2). Furthermore, I'd
 speculate that if you were to interlace the strxfrm() calls with
 copying each text string, somewhere like within a specialised
 datumCopy(), that would make the approach more efficient still, as you
 specify a location for the blob in the just-palloc()'d leading-key
 private memory directly, rather than just using memcpy.

Well, it's still got the problem of blowing up memory usage.  I just
can't get excited about optimizing for the case where we can consume
10x the memory and still fit in work_mem.  If we've got that case, the
sort is gonna be pretty fast anyway.   The case where preprocessing
wins is when there are going to be a large number of comparisons
against each tuple - i.e. lg(N) is large.  But the cases where we
could pre-transform with strxfrm are those where the data fits in a
small percentage of work mem - i.e. lg(N) is small.  I'm open to
somebody showing up with a test result that demonstrates that it's
worthwhile, but to me it seems like it's chasing diminishing returns.

The point of this patch isn't really to improve things for the
collation-aware case, although it's nice that it does.  The point is
rather to shave off a double-digit percentage off the time it takes to
do the sort required to build a C-collation index, which is what
people should be using when they don't care about < and >, which most
don't.  Despite Tom's concerns, I don't think there's anything in this
patch that can't be fairly easily revised at a later date if we decide
we want a different API.  I think it's worth picking the low-hanging
fruit in the meantime.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Robert Haas
On Thu, Jun 14, 2012 at 2:10 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Why have you made the reusable buffer managed by sortsupport
 TEXTBUFLEN-aligned? The existing rationale for that constant (whose
 value is 1024) does not seem to carry forward here:

  * This should be large enough that most strings will be fit, but small
  * enough that we feel comfortable putting it on the stack.

Well, as the comments explain:

+   /*
+* We avoid repeated palloc/pfree on long strings by keeping the buffers
+* we allocate around for the duration of the sort.  When we
expand them,
+* we round the size off to the next multiple of TEXTBUFLEN in order to avoid
+* repeatedly expanding them by very small amounts.
+*/

 ISTM it would be on average worth the hit of having to repalloc a few
 more times for larger strings by making that buffer much smaller
 initially, and doubling its size each time that proved insufficient,
 rather than increasing its size to the smallest possible
 TEXTBUFLEN-aligned size that you can get away with for the immediately
 subsequent memcpy. Realistically, any database I've ever worked with
 would probably be able to fit a large majority of its text strings
 into 16 chars of memory - you yourself said that sorting toasted text
 isn't at all common.

I thought that doubling repeatedly would be overly aggressive in terms
of memory usage.  Blowing the buffers out to 8kB because we hit a
string that's a bit over 4kB isn't so bad, but blowing them out to
256MB because we hit a string that's a bit over 128MB seems a bit
excessive.

Also, I don't think it really saves anything from a performance point
of view.  The worst case for this is if we're asked to repeatedly
compare strings that expand the buffer by a kilobyte each time.
First, how likely is that to happen in a real world test case?  In
many cases, most of the input strings will be of approximately equal
length; also in many cases, that length will be short; even if the
lengths take the worst possible distribution, we have to hit them in
the worst possible order for this to be a problem.  Second, even if it
does happen, does it really matter?  Suppose we expand the buffer a
kilobyte at a time from an initial size of 1kB all the way out to
256MB.  That's 256,000 palloc calls, so we must be sorting at least
256,000 datums, at least 128,000 of which are longer than 128MB.  I
think the cost of calling memcpy() and strcoll() repeatedly on all
those long datums - not to mention repeatedly detoasting them - is
going to bludgeon the palloc overhead into complete insignificance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Peter Geoghegan
On 14 June 2012 19:28, Robert Haas robertmh...@gmail.com wrote:
 I thought that doubling repeatedly would be overly aggressive in terms
 of memory usage.  Blowing the buffers out to 8kB because we hit a
 string that's a bit over 4kB isn't so bad, but blowing them out to
 256MB because we hit a string that's a bit over 128MB seems a bit
 excessive.

That's pretty much what all popular dynamic array implementations do,
from C++'s std::vector to Python's list (it's a misnomer). Having to
allocate 256MB for a buffer to contain a string a bit over 128MB may
seem excessive, until you later get a 250MB string. Even if doubling
is generally excessive, which I doubt, that's beside the point, which
is that expanding the array by some constant proportion results in
each insertion taking amortized constant time.

 Suppose we expand the buffer a
 kilobyte at a time from an initial size of 1kB all the way out to
 256MB.  That's 256,000 palloc calls, so we must be sorting at least
 256,000 datums, at least 128,000 of which are longer than 128MB.  I
 think the cost of calling memcpy() and strcoll() repeatedly on all
 those long datums - not to mention repeatedly detoasting them - is
 going to bludgeon the palloc overhead into complete insignificance.

I fail to understand how this sortsupport buffer fundamentally differs
from a generic dynamic array abstraction built to contain chars. That
being the case, I see no reason not to just do what everyone else does
when expanding dynamic arrays, and no reason why we shouldn't make
essentially the same time-space trade-off here as others do elsewhere.

Another concern is that it seems fairly pointless to have two buffers.
Wouldn't it be more sensible to have a single buffer that was
partitioned to make two logical, equally-sized buffers, given that in
general each buffer is expected to grow at exactly the same rate?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Robert Haas
On Thu, Jun 14, 2012 at 3:24 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 14 June 2012 19:28, Robert Haas robertmh...@gmail.com wrote:
 I thought that doubling repeatedly would be overly aggressive in terms
 of memory usage.  Blowing the buffers out to 8kB because we hit a
 string that's a bit over 4kB isn't so bad, but blowing them out to
 256MB because we hit a string that's a bit over 128MB seems a bit
 excessive.

 That's pretty much what all popular dynamic array implementations do,
 from C++'s std::vector to Python's list (it's a misnomer). Having to
 allocate 256MB for a buffer to contain a string a bit over 128MB may
 seem excessive, until you later get a 250MB string. Even if doubling
 is generally excessive, which I doubt, that's beside the point, which
 is that expanding the array by some constant proportion results in
 each insertion taking amortized constant time.

Yeah, but *it doesn't matter*.  If you test this on strings that are
long enough that they get pushed out to TOAST, you'll find that it
doesn't measurably improve performance, because the overhead of
detoasting so completely dominates any savings on the palloc side that
you can't pick them out of the inter-run noise.  Risking eating up an
extra 100MB of memory that we don't really need in order to obtain a
performance optimization that is far too small to measure does not
make sense.  The case with std::vector is not analogous; they don't
have any way of knowing what other overhead you are incurring between
insertions into the vector, so it's reasonable to suppose that the
cost of the vector insertions themselves might be material.  Here we
know that it doesn't matter, so the application of Knuth's first law
of optimization is appropriate.

 Suppose we expand the buffer a
 kilobyte at a time from an initial size of 1kB all the way out to
 256MB.  That's 256,000 palloc calls, so we must be sorting at least
 256,000 datums, at least 128,000 of which are longer than 128MB.  I
 think the cost of calling memcpy() and strcoll() repeatedly on all
 those long datums - not to mention repeatedly detoasting them - is
 going to bludgeon the palloc overhead into complete insignificance.

 I fail to understand how this sortsupport buffer fundamentally differs
 from a generic dynamic array abstraction built to contain chars. That
 being the case, I see no reason not to just do what everyone else does
 when expanding dynamic arrays, and no reason why we shouldn't make
 essentially the same time-space trade-off here as others do elsewhere.

 Another concern is that it seems fairly pointless to have two buffers.
 Wouldn't it be more sensible to have a single buffer that was
 partitioned to make two logical, equally-sized buffers, given that in
 general each buffer is expected to grow at exactly the same rate?

Sure, but it would be making the code more complicated in return for
no measurable performance benefit.  We generally avoid that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Peter Geoghegan
On 14 June 2012 20:32, Robert Haas robertmh...@gmail.com wrote:
 Yeah, but *it doesn't matter*.  If you test this on strings that are
 long enough that they get pushed out to TOAST, you'll find that it
 doesn't measurably improve performance, because the overhead of
 detoasting so completely dominates any savings on the palloc side that
 you can't pick them out of the inter-run noise.

That's probably true, but it's also beside the point. As recently as a
few hours ago, you yourself said "my guess is that most values people
sort by are pretty short, making this concern mostly academic". Why
are you getting hung up on toasting now?

 Here we know that it doesn't matter, so the application of Knuth's first law
 of optimization is appropriate.

I'm not advocating some Byzantine optimisation, or even something that
could reasonably be described as an optimisation at all here. I'm
questioning why you've unnecessarily complicated the code by having
the buffer size just big enough to fit the biggest value seen so far,
but arbitrarily aligned to a value that is completely irrelevant to
bttextfastcmp_locale(), rather than using simple geometric expansion,
which is more or less the standard way of managing the growth of a
dynamic array.

You have to grow the array in some way. The basic approach I've
outlined has something to recommend it - why does it make sense to
align the size of the buffer to TEXTBUFLEN in particular though? It's
quite easy to imagine what you've done here resulting in an excessive
number of allocations (and pfree()s), which *could* be expensive. If
you're so conservative about allocating memory, don't grow the array
at quite so aggressive a rate as doubling it each time.

There is a trade-off between space and time to be made here, but I
don't know why you think that the right choice is to use almost the
smallest possible amount of memory in all cases.

 Another concern is that it seems fairly pointless to have two buffers.
 Wouldn't it be more sensible to have a single buffer that was
 partitioned to make two logical, equally-sized buffers, given that in
 general each buffer is expected to grow at exactly the same rate?

 Sure, but it would be making the code more complicated in return for
 no measurable performance benefit.  We generally avoid that.

Fair enough.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-14 Thread Robert Haas
On Thu, Jun 14, 2012 at 6:30 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Here we know that it doesn't matter, so the application of Knuth's first law
 of optimization is appropriate.

 I'm not advocating some Byzantine optimisation, or even something that
 could reasonably be described as an optimisation at all here. I'm
 questioning why you've unnecessarily complicated the code by having
 the buffer size just big enough to fit the biggest value seen so far,
 but arbitrarily aligned to a value that is completely irrelevant to
 bttextfastcmp_locale(), rather than using simple geometric expansion,
 which is more or less the standard way of managing the growth of a
 dynamic array.

Well, so... palloc, and most malloc implementations, don't actually
just double every time forever.  They double up to some relatively
small number like 1MB, and then they expand in 1MB chunks.
std::vector may well behave as you're describing, but that's doing
something completely different.  We're not accumulating a string of
unknown length into a buffer, with the risk of having to copy the
whole buffer every time we reallocate.  We know the exact size of the
buffer we need for any given string, so the cost of a pfree/palloc is
only the cost of releasing and allocating memory; there is no
additional copying cost, as there would be for std::vector.

Look at it another way.  Right now, the worst case number of temporary
buffer allocations is about 2N lg N, assuming we do about N lg N
comparisons during a sort.  If we simply implemented the most naive
buffer reallocation algorithm possible, we might in the worst case
reallocate each buffer up to N times.  That is already better than
what we do now by a factor of lg N.  If we suppose N is about a
million, that's an improvement of 20x *in the worst case* - typically,
the improvement will be much larger, because all the strings will be
of similar length or we won't hit them in exactly increasing order of
length.  The algorithm I've actually implemented bounds the worst case
1024 times more tightly - given N strings, we can't need to enlarge
the buffer more than N/1024 times.  So with N of around a million,
this algorithm should eliminate *at least* 99.995% of the pallocs we
currently do.  How much better does it need to be?

 You have to grow the array in some way. The basic approach I've
 outlined has something to recommend it - why does it make sense to
 align the size of the buffer to TEXTBUFLEN in particular though?

There's nothing magic about TEXTBUFLEN as far as that goes.  We could
round to the nearest multiple of 8kB or whatever if that seemed
better.  But if we rounded to, say, the nearest 1MB, then someone
sorting strings that are 2kB long would use 2MB of memory for these
buffers.  Considering that we ship with work_mem = 1MB, that could
easily end up wasting almost twice work_mem.  I don't see how you can
view that as a trivial problem.  It might be worth sucking that up if
it seemed likely to dramatically improve performance, but it doesn't.
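The rounding strategy described here can be sketched as a one-line helper: size each buffer just large enough for the longest string seen so far, rounded up to a multiple of TEXTBUFLEN. The helper name is hypothetical, and the 1024-byte value is only an assumption about TEXTBUFLEN's definition in varlena.c:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value of TEXTBUFLEN; illustration only. */
#define TEXTBUFLEN 1024

/* Round a requested buffer size up to the next multiple of TEXTBUFLEN,
 * so a 2kB string costs ~2kB of buffer rather than a full 1MB chunk. */
static size_t
round_up_buflen(size_t needed)
{
    return ((needed + TEXTBUFLEN - 1) / TEXTBUFLEN) * TEXTBUFLEN;
}
```

With this rule, the buffer for N strings can be enlarged at most N/1024 times in the worst case, as argued above, while never overshooting the largest datum by more than TEXTBUFLEN - 1 bytes.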

 It's
 quite easy to imagine what you've done here resulting in an excessive
 number of allocations (and pfree()s), which *could* be expensive.

Actually, I can't imagine that at all, per the above analysis.  If I
could, I might agree with you.  :-)

 There is a trade-off between space and time to be made here, but I
 don't know why you think that the right choice is to use almost the
 smallest possible amount of memory in all cases.

Because I think that with the current implementation I can have my
cake and eat it, too.  I believe I've gotten essentially all the
available speedup for essentially no memory wastage.  If you can
convince me that I've missed something and there is still a meaningful
amount of palloc overhead left to be squeezed out, I'm all ears.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-03-19 Thread Robert Haas
On Sat, Mar 17, 2012 at 6:58 PM, Greg Stark st...@mit.edu wrote:
 On Fri, Mar 2, 2012 at 8:45 PM, Robert Haas robertmh...@gmail.com wrote:
 12789    28.2686  libc-2.13.so             strcoll_l
 6802     15.0350  postgres                 text_cmp

 I'm still curious how it would compare to call strxfrm and sort the
 resulting binary blobs. I don't think the sortsupport stuff actually
 makes this any easier though. Since using it requires storing the
 binary blob somewhere I think the support would have to be baked into
 tuplesort (or hacked into the sortkey as an expr that was evaluated
 earlier somehow).

Well, the real problem here is that the strxfrm'd representations
aren't just bigger - they are huge.  On a MacBook Pro, if the input
representation is n characters, the strxfrm'd representation is 9n+3
characters.  If we're sorting very wide tuples of which the sort
key is only a small part, maybe that would be acceptable, but if the
sort key makes up the bulk of the tuple then caching the strxfrm()
representation works out to slashing work_mem tenfold.  That might be
just fine if the sort is going to fit in work_mem either way, but if
it turns a quicksort into a heap sort then I feel pretty confident
that it's going to be a loser.  Keep in mind that even if the
strxfrm'd representation were no larger at all, it would still amount
to an additional copy of the data, so you'd still potentially be
eating up lots of work_mem that way.  The fact that it's an order of
magnitude larger is just making a bad problem worse.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-03-19 Thread Robert Haas
On Sun, Mar 18, 2012 at 11:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, it occurred to me that we could pretty easily jury-rig
 something that would give us an idea about the actual benefit available
 here.  To wit: make a C function that wraps strxfrm, basically
 strxfrm(text) returns bytea.  Then compare the performance of
 ORDER BY text_col to ORDER BY strxfrm(text_col).

 (You would need to have either both or neither of text and bytea
 using the sortsupport code paths for this to be a fair comparison.)

Since the index will be ~9x bigger at least on this machine, I think I
know the answer, but I suppose it doesn't hurt to test it.  It's not
that much work.

 One other thing I've always wondered about in this connection is the
 general performance of sorting toasted datums.  Is it better to detoast
 them in every comparison, or pre-detoast to save comparison cycles at
 the cost of having to push much more data around?  I didn't see any
 discussion of this point in Robert's benchmarks, but I don't think we
 should go very far towards enabling sortsupport for text until we
 understand the issue and know whether we need to add more infrastructure
 for it.  If you cross your eyes a little bit, this is very much like
 the strxfrm question...

It would be surprising to me if there is a one-size-fits-all answer to
this question.  For example, if the tuple got toasted because it's got
lots of columns and we had to take desperate measures to get it to fit
into an 8kB block at all, chances are that detoasting will work out
well.  We'll use a bit more memory, but hopefully that'll be repaid by
much faster comparisons.  OTOH, if you have a data set with many
relatively short strings and a few very long ones, detoasting up-front
could turn a quicksort into a heapsort.  Since only a small fraction
of the comparisons would have involved one of the problematic long
strings anyway, it's unlikely to be worth the expense of keeping those
strings around in detoasted form for the entire sort (unless maybe
reconstructing them even a few times is problematic because we're
under heavy cache pressure and we get lots of disk seeks as a result).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-03-19 Thread Martijn van Oosterhout
On Mon, Mar 19, 2012 at 12:19:53PM -0400, Robert Haas wrote:
 On Sat, Mar 17, 2012 at 6:58 PM, Greg Stark st...@mit.edu wrote:
  On Fri, Mar 2, 2012 at 8:45 PM, Robert Haas robertmh...@gmail.com wrote:
  12789    28.2686  libc-2.13.so             strcoll_l
  6802     15.0350  postgres                 text_cmp
 
  I'm still curious how it would compare to call strxfrm and sort the
  resulting binary blobs. I don't think the sortsupport stuff actually
  makes this any easier though. Since using it requires storing the
  binary blob somewhere I think the support would have to be baked into
  tuplesort (or hacked into the sortkey as an expr that was evaluated
  earlier somehow).
 
 Well, the real problem here is that the strxfrm'd representations
 aren't just bigger - they are huge.  On a MacBook Pro, if the input
 representation is n characters, the strxfrm'd representation is 9n+3
 characters.  

Ouch. I was holding out hope that you could get a meaningful
improvement if we could use the first X bytes of the strxfrm output so
you only need to do a strcoll on strings that actually nearly match.
But with an information density of 9 bytes for one character it
doesn't seem worthwhile.

That and this gem in the strxfrm manpage:

RETURN VALUE
   The  strxfrm()  function returns the number of bytes required to
   store the transformed string in dest excluding the terminating
   '\0' character.  If the value returned is n or more, the
   contents of dest are indeterminate.

Which means that you have to take the entire transformed string, you
can't just ask for the first bit. I think that kind of leaves the whole
idea dead in the water.

Just for interest I looked at the ICU API for this and they have the
same restriction.  There is another function which you can use to
return partial sort keys (ucol_nextSortKeyPart), but it produces
uncompressed sort keys, which is apparently what Mac OS X is doing and
which seems useless for our purposes.  Either this is a hard problem
or we're nowhere near a target use case.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] sortsupport for text

2012-03-19 Thread Greg Stark
On Mon, Mar 19, 2012 at 9:23 PM, Martijn van Oosterhout
klep...@svana.org wrote:
 Ouch. I was holding out hope that you could get a meaningful
 improvement if we could use the first X bytes of the strxfrm output so
 you only need to do a strcoll on strings that actually nearly match.
 But with an information density of 9 bytes for one 1 character it
 doesn't seem worthwhile.

When I was playing with glibc it was 4n. I think what they do is have
n bytes for the high order bits, then n bytes for low order bits like
capitalization or whitespace differences. I suspect they used to use
16 bits for each and have gone to some larger size.


 That and this gem in the strxfrm manpage:

 RETURN VALUE
       The  strxfrm()  function returns the number of bytes required to
       store the transformed string in dest excluding the terminating
       '\0' character.  If the value returned is n or more, the
       contents of dest are indeterminate.

 Which means that you have to take the entire transformed string, you
 can't just ask for the first bit. I think that kind of leaves the whole
 idea dead in the water.

I believe the intended API is that you allocate a buffer with your
guess of the right size, call strxfrm and if it returns a larger
number you realloc your buffer and call it again.
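That call-twice pattern might look like the following sketch (names are ours; plain C, not PostgreSQL code): allocate a guess, and if strxfrm() reports that it needed more space, realloc and call it again.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Transform src with strxfrm(), growing the buffer once if the initial
 * guess was too small.  If strxfrm() returns >= the buffer size, the
 * buffer contents are indeterminate and the call must be repeated. */
static char *
xfrm_alloc(const char *src)
{
    size_t  cap = 64;           /* initial guess */
    char   *buf = malloc(cap);
    size_t  n = strxfrm(buf, src, cap);

    if (n >= cap)
    {
        cap = n + 1;            /* exact size now known */
        buf = realloc(buf, cap);
        (void) strxfrm(buf, src, cap);
    }
    return buf;                 /* caller frees */
}
```

In the C locale, strxfrm() behaves like strcpy(), which makes the retry path easy to exercise with a string longer than the initial guess.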


-- 
greg



Re: [HACKERS] sortsupport for text

2012-03-18 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 I'm still curious how it would compare to call strxfrm and sort the
 resulting binary blobs.

In principle that should be a win; it's hard to believe that strxfrm
would have gotten into the standards if it were not a win for sorting
applications.

 I don't think the sortsupport stuff actually
 makes this any easier though. Since using it requires storing the
 binary blob somewhere I think the support would have to be baked into
 tuplesort (or hacked into the sortkey as an expr that was evaluated
 earlier somehow).

Well, obviously something has to be done, but I think it might be
possible to express this as another sortsupport API function rather than
doing anything as ugly as hardwiring strxfrm into the callers.

However, it occurred to me that we could pretty easily jury-rig
something that would give us an idea about the actual benefit available
here.  To wit: make a C function that wraps strxfrm, basically
strxfrm(text) returns bytea.  Then compare the performance of
ORDER BY text_col to ORDER BY strxfrm(text_col).

(You would need to have either both or neither of text and bytea
using the sortsupport code paths for this to be a fair comparison.)

One other thing I've always wondered about in this connection is the
general performance of sorting toasted datums.  Is it better to detoast
them in every comparison, or pre-detoast to save comparison cycles at
the cost of having to push much more data around?  I didn't see any
discussion of this point in Robert's benchmarks, but I don't think we
should go very far towards enabling sortsupport for text until we
understand the issue and know whether we need to add more infrastructure
for it.  If you cross your eyes a little bit, this is very much like
the strxfrm question...

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-03-17 Thread Greg Stark
On Fri, Mar 2, 2012 at 8:45 PM, Robert Haas robertmh...@gmail.com wrote:
 12789    28.2686  libc-2.13.so             strcoll_l
 6802     15.0350  postgres                 text_cmp

I'm still curious how it would compare to call strxfrm and sort the
resulting binary blobs. I don't think the sortsupport stuff actually
makes this any easier though. Since using it requires storing the
binary blob somewhere I think the support would have to be baked into
tuplesort (or hacked into the sortkey as an expr that was evaluated
earlier somehow).

It's a tradeoff and not an obvious one. The binary blobs are larger
and it would mean reading and copying more data around memory. But it
would mean doing the work that strcoll_l does only n times instead of
n log n times. That might be a pretty significant gain.
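The n-transforms-versus-n-log-n-comparisons trade-off can be sketched like so (illustrative names; plain C, not the sortsupport machinery): each string goes through strxfrm() exactly once, and qsort()'s comparisons then operate on the transformed blobs with cheap byte-wise strcmp() calls.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

struct item
{
    const char *orig;           /* original string */
    char       *key;            /* strxfrm'd blob */
};

static int
cmp_keys(const void *a, const void *b)
{
    return strcmp(((const struct item *) a)->key,
                  ((const struct item *) b)->key);
}

static void
sort_by_strxfrm(struct item *items, size_t n)
{
    /* Transform once per string: n strxfrm() calls total.  A zero-size
     * first call just reports the space required. */
    for (size_t i = 0; i < n; i++)
    {
        size_t  len = strxfrm(NULL, items[i].orig, 0) + 1;

        items[i].key = malloc(len);
        strxfrm(items[i].key, items[i].orig, len);
    }

    /* ~n log n comparisons, each a cheap strcmp() on the blobs. */
    qsort(items, n, sizeof(struct item), cmp_keys);

    for (size_t i = 0; i < n; i++)
        free(items[i].key);
}
```

The memory cost is the catch discussed in this thread: every key exists in both original and transformed form for the duration of the sort.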

-- 
greg



Re: [HACKERS] sortsupport for text

2012-03-08 Thread Noah Misch
On Fri, Mar 02, 2012 at 03:45:38PM -0500, Robert Haas wrote:
 SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;
 
 On unpatched master, this takes about 416 ms (plus or minus a few).
 With the attached patch, it takes about 389 ms (plus or minus a very
 few), a speedup of about 7%.
 
 I repeated the experiment using the C locale, like this:
 
 SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t COLLATE C) x;
 
 Here, it takes about 202 ms with the patch, and about 231 ms on
 unpatched master, a savings of about 13%.

 [oprofile report, further discussion]

Thanks for looking into this.  Your patch is also a nice demonstration of
sortsupport's ability to help with more than just fmgr overhead.

 Considering all that, I
 had hoped for more like a 15-20% gain from this approach, but it
 didn't happen, I suppose because some of the instructions saved just
 resulted in more processor stalls.  All the same, I'm inclined to
 think it's still worth doing.

This is a border case, but I suggest that a 13% speedup on a narrowly-tailored
benchmark, degrading to 7% in common configurations, is too meager to justify
adopting this patch.

nm



Re: [HACKERS] sortsupport for text

2012-03-08 Thread Kevin Grittner
Noah Misch n...@leadboat.com wrote:
 On Fri, Mar 02, 2012 at 03:45:38PM -0500, Robert Haas wrote:
 
 SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;
 
 [13% faster with patch for C collation; 7% faster for UTF8]
 
 I had hoped for more like a 15-20% gain from this approach, but
 it didn't happen, I suppose because some of the instructions
 saved just resulted in more processor stalls.  All the same, I'm
 inclined to think it's still worth doing.
 
 This is a border case, but I suggest that a 13% speedup on a 
 narrowly-tailored benchmark, degrading to 7% in common
 configurations, is too meager to justify adopting this patch.
 
We use the C collation and have character strings in most indexes
and ORDER BY clauses.  Unless there are significant contra-
indications, I'm in favor of adopting this patch.
 
-Kevin
