Re: Compressed TOAST Slicing

2019-04-16 Thread Andrey Borodin
> On 9 Apr 2019, at 22:30, Tom Lane wrote: > > The proposal is kind of cute, but I'll bet it's a net loss for > small copy lengths --- likely we'd want some cutoff below which > we do it with the dumb byte-at-a-time loop. True. I've made a simple extension to compare decompression time on

Re: Compressed TOAST Slicing

2019-04-09 Thread Tom Lane
Andres Freund writes: > On 2019-04-09 10:12:56 -0700, Paul Ramsey wrote: >> Wow, well beyond slicing, just being able to decompress 25% faster is a win >> for pretty much any TOAST use case. I guess the $100 question is: >> portability? The whole reason for the old-skool code that’s there now

Re: Compressed TOAST Slicing

2019-04-09 Thread Andrey Borodin
> On 9 Apr 2019, at 22:20, Andres Freund wrote: > > Just use memmove? It's usually as fast these days. No, unfortunately, memmove fixes things in an incompatible way. In pglz the side effects of overlapping addresses are necessary; memmove deliberately avoids them. I.e. bytes 01234 ^ copy here
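To make the memmove point concrete, here is a minimal sketch of the difference. The input "01234", the offsets, and the function names bytewise_match_copy and overlap_demo are my own illustration, not taken from the message or from the pglz source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Byte-at-a-time match copy, the way an LZ decompressor needs it:
 * a byte written earlier in this copy may be read again later in it.
 * (Hypothetical helper for illustration, not the pglz code.)
 */
void
bytewise_match_copy(char *dp, size_t off, size_t len)
{
    assert(off > 0);
    while (len-- > 0)
    {
        *dp = *(dp - off);
        dp++;
    }
}

/*
 * Run both copies on the same 5-byte input. The destination is byte 1,
 * the source is byte 0 and the length is 4, so the regions overlap.
 */
void
overlap_demo(char bytewise_out[6], char memmove_out[6])
{
    strcpy(bytewise_out, "01234");
    strcpy(memmove_out, "01234");

    bytewise_match_copy(bytewise_out + 1, 1, 4);  /* repeats byte 0 */
    memmove(memmove_out + 1, memmove_out, 4);     /* shifts the old bytes */
}
```

After overlap_demo(a, b), a holds "00000" and b holds "00123": the byte loop replicates the pattern, which is what the decompressor relies on, while memmove preserves the original source bytes.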

Re: Compressed TOAST Slicing

2019-04-09 Thread Andres Freund
On 2019-04-09 10:12:56 -0700, Paul Ramsey wrote: > > > On Apr 9, 2019, at 10:09 AM, Andrey Borodin wrote: > > > > He advised me to use algorithm that splits copied regions into smaller > > non-overlapping subregions with exponentially increasing size. > > > > while (off <= len) > > { > >

Re: Compressed TOAST Slicing

2019-04-09 Thread Andrey Borodin
> On 9 Apr 2019, at 22:12, Paul Ramsey wrote: > > Wow, well beyond slicing, just being able to decompress 25% faster is a win > for pretty much any TOAST use case. I guess the $100 question is: > portability? The whole reason for the old-skool code that’s there now was > concerns

Re: Compressed TOAST Slicing

2019-04-09 Thread Paul Ramsey
> On Apr 9, 2019, at 10:09 AM, Andrey Borodin wrote: > > He advised me to use algorithm that splits copied regions into smaller > non-overlapping subregions with exponentially increasing size. > > while (off <= len) > { >memcpy(dp, dp - off, off); >len -= off; >dp += off; >
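The quoted loop is cut off in this listing, so the following is only a sketch of how the described approach could look, assuming the offset doubles on each pass (my reading of "exponentially increasing size"). The function name doubling_match_copy and the final tail copy are my assumptions, not the posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from dp - off to dp with LZ "repeat the pattern"
 * semantics, using only non-overlapping memcpy calls.
 *
 * After each pass the bytes behind dp hold twice as many repetitions
 * of the pattern, so the next memcpy can copy twice as much while its
 * source and destination still do not overlap.
 * (Hypothetical sketch; the doubling step is an assumption.)
 */
void
doubling_match_copy(unsigned char *dp, size_t off, size_t len)
{
    assert(off > 0);

    while (off <= len)
    {
        memcpy(dp, dp - off, off);  /* source ends exactly where dest starts */
        len -= off;
        dp += off;
        off *= 2;                   /* twice as much pattern is now available */
    }

    /* Remaining tail is shorter than off, so this cannot overlap either. */
    memcpy(dp, dp - off, len);
}
```

For a buffer starting with "abc", calling doubling_match_copy(buf + 3, 3, 20) issues memcpy calls of 3, 6 and 11 bytes instead of 20 single-byte stores. Whether that wins for short matches is the open question raised elsewhere in the thread.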

Re: Compressed TOAST Slicing

2019-04-09 Thread Andrey Borodin
Hi! > On 12 Mar 2019, at 10:22, Andrey Borodin wrote: > > 3. And I'd use memmove despite the comment why we do not do that. It is > SSE-optimized and cache-optimized nowadays. So, I've pushed the idea a little further and showed that decompression byte-copy loop to Vladimir Leskov. while

Re: Compressed TOAST Slicing

2019-04-02 Thread Stephen Frost
Greetings, * Darafei "Komяpa" Praliaskouski (m...@komzpa.net) wrote: > > I'll plan to push this tomorrow with the above change (and a few > > additional comments to explain what all is going on..). > > Is everything ok? Can it be pushed? This has been pushed now. Thanks! Stephen

Re: Compressed TOAST Slicing

2019-04-02 Thread Komяpa
Hi! > I'll plan to push this tomorrow with the above change (and a few > additional comments to explain what all is going on..). Is everything ok? Can it be pushed? I'm looking here, haven't found it pushed and worry about this. https://github.com/postgres/postgres/commits/master

Re: Compressed TOAST Slicing

2019-03-30 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > On Mar 19, 2019, at 4:47 AM, Stephen Frost wrote: > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: > >>> On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: > >>> +1. I think Paul had it right originally. > >> > >> In that

Re: Compressed TOAST Slicing

2019-03-19 Thread Paul Ramsey
> On Mar 19, 2019, at 4:47 AM, Stephen Frost wrote: > > Greetings, > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: >>> On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: >>> +1. I think Paul had it right originally. >> >> In that spirit, here is a “one pglz_decompress function, new

Re: Compressed TOAST Slicing

2019-03-19 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: > > +1. I think Paul had it right originally. > > In that spirit, here is a “one pglz_decompress function, new parameter” > version for commit. Alright, I've been working through

Re: Compressed TOAST Slicing

2019-03-18 Thread Paul Ramsey
> On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: > > On Mon, Mar 18, 2019 at 10:14 AM Tom Lane wrote: >> Stephen Frost writes: >>> * Andres Freund (and...@anarazel.de) wrote: I don't think that should stop us from breaking the API. You've got to do quite low level stuff to need

Re: Compressed TOAST Slicing

2019-03-18 Thread Robert Haas
On Mon, Mar 18, 2019 at 10:14 AM Tom Lane wrote: > Stephen Frost writes: > > * Andres Freund (and...@anarazel.de) wrote: > >> I don't think that should stop us from breaking the API. You've got to > >> do quite low level stuff to need pglz directly, in which case such an > >> API change should

Re: Compressed TOAST Slicing

2019-03-18 Thread Tom Lane
Stephen Frost writes: > * Andres Freund (and...@anarazel.de) wrote: >> I don't think that should stop us from breaking the API. You've got to >> do quite low level stuff to need pglz directly, in which case such an >> API change should be the least of your problems between major versions. >

Re: Compressed TOAST Slicing

2019-03-17 Thread Stephen Frost
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: > > On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: > > > I tested on windows mingw64 (as of a week ago) and confirmed the > > > patch applies cleanly and significantly faster

Re: Compressed TOAST Slicing

2019-03-13 Thread Paul Ramsey
> On Mar 13, 2019, at 9:32 AM, Andrey Borodin wrote: > > > >> On 13 Mar 2019, at 21:05, Paul Ramsey >> wrote: >> >> Here is a new (final?) patch ... >> >> > > This check > > @@ -744,6 +748,8 @@ pglz_decompress(const char *source, int32 slen, char > *dest, >

Re: Compressed TOAST Slicing

2019-03-13 Thread Andrey Borodin
> On 13 Mar 2019, at 21:05, Paul Ramsey wrote: > > Here is a new (final?) patch ... > > This check @@ -744,6 +748,8 @@ pglz_decompress(const char *source, int32 slen, char *dest, { *dp = dp[-off];

Re: Compressed TOAST Slicing

2019-03-13 Thread Paul Ramsey
On Mar 13, 2019, at 8:25 AM, Paul Ramsey wrote: On Mar 13, 2019, at 3:09 AM, Tomas Vondra wrote: On 3/13/19 3:19 AM, Michael Paquier wrote: On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: I don't think this is even close to

Re: Compressed TOAST Slicing

2019-03-13 Thread Paul Ramsey
> On Mar 13, 2019, at 3:09 AM, Tomas Vondra > wrote: > > On 3/13/19 3:19 AM, Michael Paquier wrote: >> On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: >>> I don't think this is even close to popular enough to incur the >>> maybe of a separate function / more complicated

Re: Compressed TOAST Slicing

2019-03-13 Thread Tomas Vondra
On 3/13/19 3:19 AM, Michael Paquier wrote: > On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: >> I don't think this is even close to popular enough to incur the >> maybe of a separate function / more complicated interface. By this >> logic we can change basically no APIs anymore. >

Re: Compressed TOAST Slicing

2019-03-12 Thread Michael Paquier
On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: > I don't think this is even close to popular enough to incur the > maybe of a separate function / more complicated interface. By this > logic we can change basically no APIs anymore. Well, if folks here think that it is not worth

Re: Compressed TOAST Slicing

2019-03-12 Thread Andres Freund
On March 12, 2019 6:58:12 PM PDT, Michael Paquier wrote: >On Tue, Mar 12, 2019 at 11:08:15AM -0700, Paul Ramsey wrote: >>> On Mar 12, 2019, at 9:45 AM, Paul Ramsey >wrote: >>> I was going to say that the function is only used twice in the code >>> base, but I see it’s now used four times. So

Re: Compressed TOAST Slicing

2019-03-12 Thread Michael Paquier
On Tue, Mar 12, 2019 at 11:08:15AM -0700, Paul Ramsey wrote: >> On Mar 12, 2019, at 9:45 AM, Paul Ramsey wrote: >> I was going to say that the function is only used twice in the code >> base, but I see it’s now used four times. So maybe leave the old >> signature in place and add the new one for

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 11, 2019, at 10:22 PM, Andrey Borodin wrote: > > Hi! > >> On 21 Feb 2019, at 23:50, Paul Ramsey >> wrote: >> >> Merci! Attached are updated patches. >> > > As noted before, patches are extremely useful. > So, I've looked into the code too. > > I've got some questions

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 12, 2019, at 9:45 AM, Paul Ramsey wrote: > > > >> On Mar 12, 2019, at 9:13 AM, Andres Freund wrote: >> >> On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: >>> On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: I tested on windows mingw64 (as of a week ago) and

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 12, 2019, at 9:13 AM, Andres Freund wrote: > > On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: >> On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: >>> I tested on windows mingw64 (as of a week ago) and confirmed the >>> patch applies cleanly and significantly faster

Re: Compressed TOAST Slicing

2019-03-12 Thread Andres Freund
On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: > On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: > > I tested on windows mingw64 (as of a week ago) and confirmed the > > patch applies cleanly and significantly faster for left, substr > > tests than head. > > int32 >

Re: Compressed TOAST Slicing

2019-03-12 Thread Andrey Borodin
> On 12 Mar 2019, at 19:40, Paul Ramsey wrote: > >> On Mar 11, 2019, at 10:42 PM, Michael Paquier wrote: >> >> int32 >> pglz_decompress(const char *source, int32 slen, char *dest, >> - int32 rawsize) >> + int32 rawsize, bool

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 11, 2019, at 10:42 PM, Michael Paquier wrote: > > On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: >> I tested on windows mingw64 (as of a week ago) and confirmed the >> patch applies cleanly and significantly faster for left, substr >> tests than head. > > int32 >

Re: Compressed TOAST Slicing

2019-03-11 Thread Michael Paquier
On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: > I tested on windows mingw64 (as of a week ago) and confirmed the > patch applies cleanly and significantly faster for left, substr > tests than head. int32 pglz_decompress(const char *source, int32 slen, char *dest, -

Re: Compressed TOAST Slicing

2019-03-11 Thread Andrey Borodin
Hi! > On 21 Feb 2019, at 23:50, Paul Ramsey wrote: > > Merci! Attached are updated patches. > As noted before, patches are extremely useful. So, I've looked into the code too. I've got some questions about pglz_decompress() changes: 1. + if (dp >=

Re: Compressed TOAST Slicing

2019-03-11 Thread Regina Obe
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested No need for documentation as this is a performance improvement

Re: Compressed TOAST Slicing

2019-03-11 Thread Alvaro Herrera
On 2019-Mar-11, Darafei Praliaskouski wrote: > The feature is super valuable for complex PostGIS-enabled databases. After having to debug a perf problem in this area, I agree, +1 for the patch. Thanks -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7

Re: Compressed TOAST Slicing

2019-03-11 Thread Darafei Praliaskouski
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: tested, passed Spec compliant: not tested Documentation: not tested I have read the patch and have no problems with it. The feature is

Re: Compressed TOAST Slicing

2019-02-21 Thread Paul Ramsey
On Wed, Feb 20, 2019 at 1:12 PM Stephen Frost wrote: > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite > > wrote: > > > > > > What about starts_with(string, prefix)? > > Thanks, I'll add that. > > That sounds good to me, I look forward to an

Re: Compressed TOAST Slicing

2019-02-20 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite > wrote: > > > > Paul Ramsey wrote: > > > > > Oddly enough, I couldn't find many/any things that were sensitive to > > > left-end decompression. The only exception is "LIKE

Re: Compressed TOAST Slicing

2019-02-20 Thread Tomas Vondra
On 2/20/19 7:50 PM, Robert Haas wrote: > On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey wrote: >> What this does not support: any function that probably wants >> less-than-everything, but doesn’t know how big a slice to look >> for. Stephen thinks I should put an iterator on decompression, >>

Re: Compressed TOAST Slicing

2019-02-20 Thread Tom Lane
Paul Ramsey writes: >> On Feb 20, 2019, at 10:37 AM, Simon Riggs wrote: >> If we add one set of code now and need to add another different one later, >> we will have 2 sets of code that do similar things. > Note that adding an iterator isn’t adding two ways to do the same thing, > since the

Re: Compressed TOAST Slicing

2019-02-20 Thread Daniel Verite
Paul Ramsey wrote: > > text_starts_with(arg1,arg2) in varlena.c does a full decompression > > of arg1 when it could limit itself to the length of the smaller arg2: > > Nice catch, I didn't find that one as it's not user visible, seems to > be only called in spgist (!!) It's also

Re: Compressed TOAST Slicing

2019-02-20 Thread Paul Ramsey
On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite wrote: > > Paul Ramsey wrote: > > > Oddly enough, I couldn't find many/any things that were sensitive to > > left-end decompression. The only exception is "LIKE this%" which > > clearly would be helped, but unfortunately wouldn't be a quick >

Re: Compressed TOAST Slicing

2019-02-20 Thread Robert Haas
On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey wrote: > What this does not support: any function that probably wants > less-than-everything, but doesn’t know how big a slice to look for. Stephen > thinks I should put an iterator on decompression, which would be an > interesting piece of work.

Re: Compressed TOAST Slicing

2019-02-20 Thread Daniel Verite
Paul Ramsey wrote: > Oddly enough, I couldn't find many/any things that were sensitive to > left-end decompression. The only exception is "LIKE this%" which > clearly would be helped, but unfortunately wouldn't be a quick > drop-in, but a rather major reorganization of the regex handling.

Re: Compressed TOAST Slicing

2019-02-20 Thread Paul Ramsey
> On Feb 20, 2019, at 10:37 AM, Simon Riggs wrote: > > -1, I think this is blowing up the complexity of a already useful patch, > even though there's no increase in complexity due to the patch proposed > here. I totally get wanting incremental decompression for jsonb, but I > don't see why

Re: Compressed TOAST Slicing

2019-02-20 Thread Simon Riggs
On Wed, 20 Feb 2019 at 16:27, Andres Freund wrote: > > > Sure, but we have the choice between something that benefits just a few > > cases or one that benefits more widely. > > > > If we all only work on the narrow use cases that are right in front of us > > at the present moment then we would

Re: Compressed TOAST Slicing

2019-02-20 Thread Robert Haas
On Wed, Feb 20, 2019 at 11:27 AM Andres Freund wrote: > -1, I think this is blowing up the complexity of a already useful patch, > even though there's no increase in complexity due to the patch proposed > here. I totally get wanting incremental decompression for jsonb, but I > don't see why Paul

Re: Compressed TOAST Slicing

2019-02-20 Thread Andres Freund
On 2019-02-20 08:39:38 +, Simon Riggs wrote: > On Tue, 19 Feb 2019 at 23:09, Paul Ramsey wrote: > > > On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs wrote: > > > > > Could we get an similarly optimized implementation of -> operator for > > JSONB as well? > > > Are there any other potential

Re: Compressed TOAST Slicing

2019-02-20 Thread Simon Riggs
On Tue, 19 Feb 2019 at 23:09, Paul Ramsey wrote: > On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs wrote: > > > Could we get an similarly optimized implementation of -> operator for > JSONB as well? > > Are there any other potential uses? Best to fix em all up at once and > then move on to other

Re: Compressed TOAST Slicing

2019-02-19 Thread Юрий Соколов
Some time ago I posted a PoC patch with an alternative TOAST compression scheme: instead of "compress-then-chunk" I suggested "chunk-then-compress". It decreases the compression level, but allows efficient arbitrary slicing. Wed, 20 Feb 2019, 2:09 Paul Ramsey pram...@cleverelephant.ca: > On Sat, Feb

Re: Compressed TOAST Slicing

2019-02-19 Thread Paul Ramsey
On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs wrote: > Could we get an similarly optimized implementation of -> operator for JSONB > as well? > Are there any other potential uses? Best to fix em all up at once and then > move on to other things. Thanks. Oddly enough, I couldn't find many/any

Re: Compressed TOAST Slicing

2019-02-16 Thread Simon Riggs
On Thu, 6 Dec 2018 at 20:54, Paul Ramsey wrote: > On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih > wrote: > > > > The idea looks good and believing your performance evaluation it seems > > like a practical one too. > > Thank you kindly for the review! > Sounds good. Could we get an similarly

Re: Compressed TOAST Slicing

2019-02-15 Thread Andres Freund
Hi Stephen, On 2018-12-06 12:54:18 -0800, Paul Ramsey wrote: > On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih > wrote: > > > > The idea looks good and believing your performance evaluation it seems > > like a practical one too. > > Thank you kindly for the review! > > > A comment explaining how

Re: Compressed TOAST Slicing

2018-12-06 Thread Paul Ramsey
On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih wrote: > > The idea looks good and believing your performance evaluation it seems > like a practical one too. Thank you kindly for the review! > A comment explaining how this check differs for is_slice case would be helpful. > Looks like PG indentation

Re: Compressed TOAST Slicing

2018-12-02 Thread Rafia Sabih
On Fri, Nov 2, 2018 at 11:55 PM Paul Ramsey wrote: > > As threatened, I have also added a patch to left() to also use sliced access. Hi Paul, The idea looks good and believing your performance evaluation it seems like a practical one too. I had a look at this patch and here are my initial

Re: Compressed TOAST Slicing

2018-11-02 Thread Paul Ramsey
As threatened, I have also added a patch to left() to also use sliced access. compressed-datum-slicing-20190102a.patch Description: Binary data compressed-datum-slicing-left-20190102a.patch Description: Binary data

Re: Compressed TOAST Slicing

2018-11-02 Thread Paul Ramsey
On Thu, Nov 1, 2018 at 4:02 PM Tom Lane wrote: > Paul Ramsey writes: > > On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost wrote: > >> and secondly, why we wouldn't consider > >> handling a non-zero offset. A non-zero offset would, of course, still > >> require decompressing from the start and

Re: Compressed TOAST Slicing

2018-11-01 Thread Tom Lane
Paul Ramsey writes: > On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost wrote: >> and secondly, why we wouldn't consider >> handling a non-zero offset. A non-zero offset would, of course, still >> require decompressing from the start and then just throwing away what we >> skip over, but we're going

Re: Compressed TOAST Slicing

2018-11-01 Thread Paul Ramsey
On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost wrote: > Greetings, > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > The attached patch adds in a code path to do a partial decompression of > the > > TOAST entry, when the requested slice is at the start of the object. > > There two things

Re: Compressed TOAST Slicing

2018-11-01 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > The attached patch adds in a code path to do a partial decompression of the > TOAST entry, when the requested slice is at the start of the object. Neat! > As usual, doing less work is faster. Definitely. > Interesting note to

Compressed TOAST Slicing

2018-11-01 Thread Paul Ramsey
Currently, PG_DETOAST_DATUM_SLICE when run on a compressed TOAST entry will first decompress the whole object, then extract the relevant slice. When the desired slice is at or near the front of the object, this is obviously non-optimal. The attached patch adds in a code path to do a partial
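The idea of stopping decompression once the requested prefix has been produced can be sketched with a toy format. The run-length encoding below and the name toy_rle_decompress_prefix are hypothetical and chosen only to keep the example short; this is not the pglz format and not the code in the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy format (hypothetical, NOT pglz): the input is a sequence of
 * (count, value) byte pairs. Decompress at most `want` bytes into dst
 * and return how many bytes were produced.
 *
 * The `produced < want` checks let the loops stop as soon as the
 * requested prefix is complete, so the rest of the input is never read.
 */
size_t
toy_rle_decompress_prefix(const unsigned char *src, size_t slen,
                          unsigned char *dst, size_t want)
{
    size_t produced = 0;
    size_t i = 0;

    assert(slen % 2 == 0);

    while (i + 1 < slen && produced < want)
    {
        size_t        run = src[i];
        unsigned char value = src[i + 1];

        while (run > 0 && produced < want)
        {
            dst[produced++] = value;
            run--;
        }
        i += 2;
    }
    return produced;
}
```

With the input pairs (3,'a'), (200,'b'), (5,'c') and want = 5, the function writes "aaabb" and returns without expanding the other 198 'b' bytes or touching the last pair. A slice at the front of a large value gets cheaper in the same way.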