Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-02-12 Thread Peter Geoghegan
On Wed, Feb 12, 2014 at 3:30 PM, Martijn van Oosterhout wrote:
> (A bit late to the party). This idea has come up before and the most
> annoying thing is that braindead strxfrm api. Namely, to strxfrm a
> large string you need to strxfrm it completely even if you only want
> the first 8 bytes.
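The complaint quoted above is about the C library's strxfrm(): it transforms the whole input, and if the destination buffer is too small its contents are indeterminate, so a caller cannot cheaply ask for only the leading bytes of the blob. A minimal sketch of the consequence, using Python's `locale.strxfrm` (which wraps the C library's transformation routine). The "C" locale is an assumption made so the sketch runs anywhere, and the helpers `blob_prefix` and `prefix_compare` are hypothetical names, not anything proposed in the thread.

```python
import locale
from functools import cmp_to_key

# Assumption: the "C" locale is used only because it is always available;
# a realistic test would select a linguistic locale such as "en_US.UTF-8".
locale.setlocale(locale.LC_COLLATE, "C")

words = ["banana", "Apple", "cherry", "apple", "applesauce"]

# Sorting with strcoll() as the comparator ...
by_strcoll = sorted(words, key=cmp_to_key(locale.strcoll))
# ... gives the same order as sorting the strxfrm() blobs with plain comparison.
by_blob = sorted(words, key=locale.strxfrm)


def blob_prefix(s, n=8):
    """Hypothetical helper: the first n units of the transformed string.

    The whole string is still transformed; truncation happens afterwards,
    which is exactly the cost being complained about.
    """
    return locale.strxfrm(s)[:n]


def prefix_compare(a, b, n=8):
    """Return -1 or 1 when the prefixes decide the order, 0 when they tie.

    A tie is inconclusive: the caller must fall back to a full comparison.
    """
    pa, pb = blob_prefix(a, n), blob_prefix(b, n)
    return (pa > pb) - (pa < pb)
```

A nonzero result from `prefix_compare` can be trusted; a zero result only says the first `n` units match and decides nothing about the full strings.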

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-02-12 Thread Martijn van Oosterhout
On Sun, Feb 02, 2014 at 05:09:06PM -0800, Peter Geoghegan wrote:
> However, it also occurs to me that strxfrm() blobs have another useful
> property: We (as, say, the author of an equality operator on text, an
> operator intended for a btree operator class) *can* trust a strcmp()'s
> result on blob

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-02-02 Thread Peter Geoghegan
On Thu, Jan 30, 2014 at 8:51 PM, Peter Geoghegan wrote:
> I've done some more digging. It turns out that the 1977 paper "An
> Encoding Method for Multifield Sorting and Indexing" describes a
> technique that involves concatenating multiple column values and
> comparing them using a simple strcmp()
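To illustrate the general idea of concatenating several column values into one key that a plain bytewise comparison can order, here is a minimal sketch. The escaping scheme below (a zero byte becomes 0x00 0x01, each field ends with 0x00 0x00) is an assumption chosen for illustration and is not claimed to be the encoding the 1977 paper uses; `encode_field` and `encode_key` are hypothetical names.

```python
def encode_field(value: bytes) -> bytes:
    """Encode one field so that bytewise order is preserved.

    Assumed scheme: escape each 0x00 as 0x00 0x01 and terminate with
    0x00 0x00, so the terminator sorts below any continuation of the field.
    """
    return value.replace(b"\x00", b"\x00\x01") + b"\x00\x00"


def encode_key(fields) -> bytes:
    """Concatenate the encoded fields into one comparable key."""
    return b"".join(encode_field(f) for f in fields)


rows = [
    (b"ab", b"c"),
    (b"a", b"bc"),     # naive concatenation would collide with the row above
    (b"a\x00", b""),
    (b"a", b"\xff"),
]

# Field-by-field order and single-key bytewise order agree.
by_fields = sorted(rows)
by_key = sorted(rows, key=encode_key)
```

Because this particular encoding emits zero bytes, a C implementation would compare the keys with memcmp() rather than strcmp(); an encoding that avoids zero bytes entirely would be needed for strcmp() to apply.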

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2014 at 5:05 PM, Peter Geoghegan wrote:
> On Thu, Jan 30, 2014 at 5:04 PM, Tom Lane wrote:
>>> That's not hard to prevent. If that should happen, we don't go with
>>> the strxfrm() datum. We have a spare IndexTuple bit we could use to
>>> mark when the optimization was applied.
>>

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2014 at 3:49 PM, Peter Geoghegan wrote:
> So ISTM that we could come up with an infrastructure, possibly just
> for insertion scanKeys (limiting the code footprint of all of this) in
> order to inner-page-process datums at this juncture, and store a blob
> instead, for later saving

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2014 at 5:04 PM, Tom Lane wrote:
>> That's not hard to prevent. If that should happen, we don't go with
>> the strxfrm() datum. We have a spare IndexTuple bit we could use to
>> mark when the optimization was applied.
>
> You'd need a bit per column, no?

I don't think so. It would

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2014 at 4:45 PM, Peter Geoghegan wrote:
> So we consider the
> appropriateness of a regular strcoll() or a strxfrm() in all contexts
> (in a generic and extensible manner, but that's essentially what we
> do). I'm not too discouraged by this restriction, because in practice
> it wo

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Tom Lane
Peter Geoghegan writes:
> On Thu, Jan 30, 2014 at 4:34 PM, Tom Lane wrote:
>> Quite aside from the index bloat risk, this effect means a 3-4x reduction
>> in the maximum string length that can be indexed before getting the
>> dreaded "Values larger than 1/3 of a buffer page cannot be indexed" err

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2014 at 4:34 PM, Tom Lane wrote:
> Quite aside from the index bloat risk, this effect means a 3-4x reduction
> in the maximum string length that can be indexed before getting the
> dreaded "Values larger than 1/3 of a buffer page cannot be indexed" error.
> Worse, a value insertion
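The size concern quoted above can be made concrete with rough arithmetic. This sketch assumes PostgreSQL's default 8192-byte block size and takes the "1/3 of a buffer page" bound literally; the real limit differs slightly once page and tuple overhead is counted. The 3x and 4x expansion factors come from the quoted message, and `max_indexable_source_bytes` is a hypothetical helper.

```python
BLCKSZ = 8192  # assumption: PostgreSQL's default block size in bytes


def max_indexable_source_bytes(expansion, page_size=BLCKSZ):
    """Largest original string whose blob still fits in 1/3 of a page.

    `expansion` is the assumed ratio of strxfrm() blob size to input size.
    """
    limit = page_size // 3          # literal "1/3 of a buffer page"
    return limit // expansion


for factor in (3, 4):
    print(factor, max_indexable_source_bytes(factor))
# prints:
# 3 910
# 4 682
```

Under these assumptions a string of roughly 700 to 900 bytes would already hit the error, where an untransformed value of about 2700 bytes would still fit.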

Re: [HACKERS] Making strxfrm() blobs in indexes work

2014-01-30 Thread Tom Lane
Peter Geoghegan writes:
> On more occasions than I care to recall, someone has suggested that it
> would be valuable to do something with strxfrm() blobs in order to
> have cheaper locale-aware text comparisons. One obvious place to do so
> would be in indexes, but in the past that has been dismis