Re: [HACKERS] Remove 1MB size limit in tsvector

2017-09-11 Thread Tomas Vondra
On 09/11/2017 01:54 PM, Robert Haas wrote: > On Mon, Sep 11, 2017 at 5:33 AM, Ildus Kurbangaliev > wrote: >> Moreover, RUM index >> stores positions + lexemes, so it doesn't need tsvectors for ranked >> search. As a result, tsvector becomes a storage for >> building indexes (indexable type), not s

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-09-11 Thread Robert Haas
On Mon, Sep 11, 2017 at 5:33 AM, Ildus Kurbangaliev wrote: > Moreover, RUM index > stores positions + lexemes, so it doesn't need tsvectors for ranked > search. As a result, tsvector becomes a storage for > building indexes (indexable type), not something that should be used at > runtime. And the

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-09-11 Thread Ildus Kurbangaliev
On Thu, 7 Sep 2017 23:08:14 +0200 Tomas Vondra wrote: > Hi, > > On 08/17/2017 12:23 PM, Ildus Kurbangaliev wrote: > > In my benchmarks when database fits into buffers (so it's > > measurement of the time required for the tsvectors conversion) it > > gives me these results: > > > > Without conve

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-09-07 Thread Tomas Vondra
Hi, On 08/17/2017 12:23 PM, Ildus Kurbangaliev wrote: > In my benchmarks when database fits into buffers (so it's measurement of > the time required for the tsvectors conversion) it gives me these > results: > > Without conversion: > > $ ./tsbench2 -database test1 -bench_time 300 > 2017/08/17 12

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-17 Thread Ildus Kurbangaliev
On Thu, 10 Aug 2017 18:06:17 +0300 Alexander Korotkov wrote: > On Wed, Aug 9, 2017 at 7:38 PM, Robert Haas > wrote: > > > On Tue, Aug 1, 2017 at 4:00 PM, Ildus K > > wrote: > > > It's a workaround. DatumGetTSVector and > > > DatumGetTSVectorCopy will upgrade tsvector on the fly if it > > > h

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-16 Thread Ildus Kurbangaliev
On Thu, 10 Aug 2017 11:46:55 -0400 Tom Lane wrote: > Alexander Korotkov writes: > > ... > > You have random mix of tabs and spaces here. > > It's worth running pgindent over your code before submitting. It > should be pretty easy to set that up nowadays, see > src/tools/pgindent/README. (If

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-10 Thread Tom Lane
Alexander Korotkov writes: > ... > You have random mix of tabs and spaces here. It's worth running pgindent over your code before submitting. It should be pretty easy to set that up nowadays, see src/tools/pgindent/README. (If you find any portability problems while trying to install pgindent, p

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-10 Thread Alexander Korotkov
On Thu, Aug 10, 2017 at 7:37 AM, Michael Paquier wrote: > On Wed, Aug 9, 2017 at 6:38 PM, Robert Haas wrote: > > The patch doesn't really conform to our coding standards, though, so > > you need to clean it up (or, if you're not sure what you need to do, > > you need to have someone who knows ho

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-10 Thread Alexander Korotkov
On Wed, Aug 9, 2017 at 7:38 PM, Robert Haas wrote: > On Tue, Aug 1, 2017 at 4:00 PM, Ildus K > wrote: > > It's a workaround. DatumGetTSVector and > > DatumGetTSVectorCopy will upgrade tsvector on the fly if it > > has old format. > > Hmm, that seems like a real fix, not just a workaround. If yo

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Michael Paquier
On Wed, Aug 9, 2017 at 6:38 PM, Robert Haas wrote: > The patch doesn't really conform to our coding standards, though, so > you need to clean it up (or, if you're not sure what you need to do, > you need to have someone who knows how PostgreSQL code needs to look > review it for you). The documen

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Robert Haas
On Tue, Aug 1, 2017 at 4:00 PM, Ildus K wrote: > It's a workaround. DatumGetTSVector and > DatumGetTSVectorCopy will upgrade tsvector on the fly if it > has old format. Hmm, that seems like a real fix, not just a workaround. If you can transparently read the old format, there's no problem. Not

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Ildus Kurbangaliev
On Wed, 9 Aug 2017 09:01:44 +0200 Torsten Zuehlsdorff wrote: > On 01.08.2017 22:00, Ildus K wrote: > > On Tue, 1 Aug 2017 15:33:08 -0400 > > Robert Haas wrote: > > > >> On Tue, Aug 1, 2017 at 3:10 PM, Ildus K > >> wrote: > So this would break pg_upgrade for tsvector columns? > >>> >

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Torsten Zuehlsdorff
On 01.08.2017 22:00, Ildus K wrote: On Tue, 1 Aug 2017 15:33:08 -0400 Robert Haas wrote: On Tue, Aug 1, 2017 at 3:10 PM, Ildus K wrote: So this would break pg_upgrade for tsvector columns? I added a function that will convert old tsvectors on the fly. It's the approach used in hstore bef

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Ildus K
On Tue, 1 Aug 2017 15:33:08 -0400 Robert Haas wrote: > On Tue, Aug 1, 2017 at 3:10 PM, Ildus K > wrote: > >> So this would break pg_upgrade for tsvector columns? > > > > I added a function that will convert old tsvectors on the fly. It's > > the approach used in hstore before. > > Does that

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 3:10 PM, Ildus K wrote: >> So this would break pg_upgrade for tsvector columns? > > I added a function that will convert old tsvectors on the fly. It's the > approach used in hstore before. Does that mean the answer to the question that I asked is "yes, but I have a workaro

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Ildus K
On Tue, 1 Aug 2017 14:56:54 -0400 Robert Haas wrote: > On Tue, Aug 1, 2017 at 10:08 AM, Ildus Kurbangaliev > wrote: > > Historically, the tsvector type can't hold more than 1MB of data. > > I want to propose a patch that removes that limit. > > > > That limit is created by the 'pos' field of WordEntry, wh

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 10:08 AM, Ildus Kurbangaliev wrote: > Historically, the tsvector type can't hold more than 1MB of data. > I want to propose a patch that removes that limit. > > That limit is created by the 'pos' field of WordEntry, which has only > 20 bits for storage. > > In the proposed patch I re

[HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Ildus Kurbangaliev
Hello, hackers! Historically, the tsvector type can't hold more than 1MB of data. I want to propose a patch that removes that limit. That limit is created by the 'pos' field of WordEntry, which has only 20 bits for storage. In the proposed patch I removed this field and instead of it I keep offsets only
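
To make the 20-bit limit concrete, here is a minimal C sketch. The bitfield layout is meant to mirror WordEntry as declared in src/include/tsearch/ts_type.h (haspos:1, len:11, pos:20); treat that as an assumption to check against your source tree. The names POS_BITS, MAX_POS, offset_fits and store_pos are hypothetical helpers for illustration only and are not part of PostgreSQL or of the proposed patch.

```c
#include <stdint.h>

/* Assumed to mirror WordEntry in src/include/tsearch/ts_type.h:
 * one 32-bit word split into three bitfields. */
typedef struct
{
    uint32_t haspos:1;  /* does the lexeme carry position data? */
    uint32_t len:11;    /* lexeme length in bytes */
    uint32_t pos:20;    /* byte offset of the lexeme in the data area */
} WordEntry;

/* Hypothetical names, for illustration only. */
#define POS_BITS 20
#define MAX_POS  ((1u << POS_BITS) - 1)   /* 1048575, i.e. 1MB - 1 */

/* Returns 1 if a byte offset can be represented in the pos field. */
static int
offset_fits(uint32_t offset)
{
    return offset <= MAX_POS;
}

/* Stores an offset in the bitfield and reads it back. Offsets above
 * MAX_POS are reduced modulo 2^20, so the original value is lost. */
static uint32_t
store_pos(uint32_t offset)
{
    WordEntry e = {0};

    e.pos = offset;
    return e.pos;
}
```

With this layout, offset_fits(1048575) holds but offset_fits(1048576) does not, and store_pos(1048576) reads back as 0. Any lexeme whose data starts past the first megabyte cannot be addressed, which is the limit the patch sets out to remove.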