Re: Speeding up text_position_next with multibyte encodings

2019-01-28 Thread Bruce Momjian
On Fri, Jan 25, 2019 at 04:33:54PM +0200, Heikki Linnakangas wrote:
> On 15/01/2019 02:52, John Naylor wrote:
> > The majority of cases are measurably faster, and the best case is at
> > least 20x faster. On the whole I'd say this patch is a performance win
> > even without further optimization. I'm

Re: Speeding up text_position_next with multibyte encodings

2019-01-25 Thread Heikki Linnakangas
On 15/01/2019 02:52, John Naylor wrote:
> The majority of cases are measurably faster, and the best case is at
> least 20x faster. On the whole I'd say this patch is a performance win
> even without further optimization. I'm marking it ready for committer.

I read through the patch one more time,

Re: Speeding up text_position_next with multibyte encodings

2019-01-14 Thread John Naylor
On Sun, Dec 23, 2018 at 9:33 AM Tomas Vondra wrote:
> So, what is the expected speedup in a "good/average" case? Do we have
> some reasonable real-world workload mixing these cases that could be
> used as a realistic benchmark?

Not sure about a realistic mix, but I investigated the tradeoffs.

Re: Speeding up text_position_next with multibyte encodings

2018-12-26 Thread John Naylor
On 12/22/18, Heikki Linnakangas wrote:
> On 14/12/2018 20:20, John Naylor wrote:
> I'm afraid that script doesn't work as a performance test. The
> position() function is immutable, so the result gets cached in the plan
> cache. All you're measuring is the speed to get the constant from the

Re: Speeding up text_position_next with multibyte encodings

2018-12-23 Thread Tomas Vondra
On 12/23/18 1:26 AM, Heikki Linnakangas wrote:
> On 14/12/2018 20:20, John Naylor wrote:
>> I signed up to be a reviewer, and I will be busy next month, so I went
>> ahead and fixed the typo in the patch that broke assert-enabled
>> builds. While at it, I standardized on the spelling

Re: Speeding up text_position_next with multibyte encodings

2018-12-22 Thread Heikki Linnakangas
On 23/12/2018 02:32, Heikki Linnakangas wrote:
> On 23/12/2018 02:28, Heikki Linnakangas wrote:
>> On 14/12/2018 23:40, John Naylor wrote:
>>> I just noticed that the contrib/citext test fails. I've set the
>>> status to waiting on author.
>> Hmm, it works for me. What failure did you see?
> Never mind, I'm

Re: Speeding up text_position_next with multibyte encodings

2018-12-22 Thread Heikki Linnakangas
On 23/12/2018 02:28, Heikki Linnakangas wrote:
> On 14/12/2018 23:40, John Naylor wrote:
>> I just noticed that the contrib/citext test fails. I've set the
>> status to waiting on author.
> Hmm, it works for me. What failure did you see?

Never mind, I'm seeing it now, with assertions enabled. Thanks,

Re: Speeding up text_position_next with multibyte encodings

2018-12-22 Thread Heikki Linnakangas
On 14/12/2018 23:40, John Naylor wrote:
> I just noticed that the contrib/citext test fails. I've set the
> status to waiting on author.

Hmm, it works for me. What failure did you see?

- Heikki

Re: Speeding up text_position_next with multibyte encodings

2018-12-22 Thread Heikki Linnakangas
On 14/12/2018 20:20, John Naylor wrote:
> I signed up to be a reviewer, and I will be busy next month, so I went
> ahead and fixed the typo in the patch that broke assert-enabled
> builds. While at it, I standardized on the spelling "start_ptr" in a
> few places to match the rest of the file. It's a bit

Re: Speeding up text_position_next with multibyte encodings

2018-12-14 Thread John Naylor
On 12/14/18, John Naylor wrote:
> I signed up to be a reviewer, and I will be busy next month, so I went
> ahead and fixed the typo in the patch that broke assert-enabled
> builds. While at it, I standardized on the spelling "start_ptr" in a
> few places to match the rest of the file. It's a bit

Re: Speeding up text_position_next with multibyte encodings

2018-12-14 Thread John Naylor
On 11/30/18, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> Unfortunately, patch doesn't compile anymore due:
>
> varlena.c: In function ‘text_position_next_internal’:
> varlena.c:1337:13: error: ‘start_ptr’ undeclared (first use in this
> function)
> Assert(start_ptr >= haystack && start_ptr <=

Re: Speeding up text_position_next with multibyte encodings

2018-11-30 Thread Dmitry Dolgov
> At Fri, 19 Oct 2018 15:52:59 +0300, Heikki Linnakangas wrote
> Attached is a patch to speed up text_position_setup/next(), in some
> common cases with multibyte encodings.

Hi,

Unfortunately, patch doesn't compile anymore due:

varlena.c: In function ‘text_position_next_internal’:

Re: Speeding up text_position_next with multibyte encodings

2018-10-22 Thread Kyotaro HORIGUCHI
Hello.

At Fri, 19 Oct 2018 15:52:59 +0300, Heikki Linnakangas wrote in <3173d989-bc1c-fc8a-3b69-f24246f73...@iki.fi>
> Attached is a patch to speed up text_position_setup/next(), in some
> common cases with multibyte encodings.
>
> text_position_next() uses the Boyer-Moore-Horspool search

Speeding up text_position_next with multibyte encodings

2018-10-19 Thread Heikki Linnakangas
Attached is a patch to speed up text_position_setup/next(), in some common cases with multibyte encodings.

text_position_next() uses the Boyer-Moore-Horspool search algorithm, with a skip table. Currently, with a multi-byte encoding, we first convert both input strings to arrays of wchars,
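
For readers unfamiliar with the algorithm named above, here is a minimal sketch of a Boyer-Moore-Horspool search with a skip table, working on raw bytes. It is an illustration only, not the code in varlena.c and not the patch: the function name bmh_search, its signature, and the 256-entry table indexed by byte value are all assumptions made for this example.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): return the byte offset of the
 * first occurrence of needle in haystack, or -1 if it does not occur.
 */
long
bmh_search(const unsigned char *haystack, size_t haystack_len,
           const unsigned char *needle, size_t needle_len)
{
    size_t skip[UCHAR_MAX + 1];
    size_t i;
    size_t pos = 0;

    if (needle_len == 0)
        return 0;               /* empty needle matches at the start */
    if (needle_len > haystack_len)
        return -1;

    /* A byte that never appears in the needle allows a full-length skip. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = needle_len;

    /*
     * For each needle byte except the last, record its distance from the
     * end of the needle. Later occurrences overwrite earlier ones, so the
     * rightmost occurrence wins.
     */
    for (i = 0; i + 1 < needle_len; i++)
        skip[needle[i]] = needle_len - 1 - i;

    while (pos + needle_len <= haystack_len)
    {
        size_t j = needle_len;

        /* Compare the current window against the needle, right to left. */
        while (j > 0 && haystack[pos + j - 1] == needle[j - 1])
            j--;
        if (j == 0)
            return (long) pos;

        /* Advance by the skip value of the window's last byte. */
        pos += skip[haystack[pos + needle_len - 1]];
    }
    return -1;
}
```

Note that this sketch reports a byte offset. With a multibyte encoding, a byte offset is not the same as a character position, so a caller that needs a character position would still have to derive it from the offset.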