On Nov 4, 2011, at 11:15, Yoann Moreau wrote:
> On 03/11/11 19:19, Florian Pflug wrote:
>> Postgres doesn't seem to contain such a function currently (don't believe
>> that, though - go and recheck the documentation. I don't know all thousands
>> of built-in functions by heart). But it's easy to add one. You could either
>> use PL/pgSQL to parse the tsvector's textual representation, or write a C
>> function. If you go the PL/pgSQL route, regexp_split_to_table() might come
>> in handy.
>
> This seems easier to program than what I was thinking about, I'm going to do
> that. But I'm wondering about size of database with the GIN index plus the
> tsvector column, and performance about parsing the whole tsvectors for each
> document I need positions from (as I need them for a very few terms).

AFAICS, the internal storage layout of tsvector should allow you to extract
an individual lexeme's positions quite efficiently (with time complexity
O(log N), where N is the number of lexemes in the tsvector). Doing so will
require you to implement your function in C, though: any solution that works
from a tsvector's textual representation will necessarily have time
complexity O(N), since it has to scan the whole string.

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
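
To illustrate the complexity difference mentioned above, here is a rough Python sketch (not PostgreSQL code). It assumes a tsvector can be modelled as a sorted array of lexemes with a parallel array of position lists; the names LEXEMES, POSITIONS, TEXT, positions_sorted() and positions_text() are made up for this example, and the real C structures and sort order differ. The text parser is deliberately simplified: it ignores weight labels and quoted/escaped lexemes.

```python
import bisect
import re

# Hypothetical stand-in for a tsvector's internal layout: lexemes kept
# sorted, each with its list of positions. This is an assumption for
# illustration only, not PostgreSQL's actual storage format.
LEXEMES = ["brown", "dog", "fox", "jump", "lazi", "quick"]
POSITIONS = [[3], [9, 12], [4], [5], [8], [2]]

# Sample textual form of the same data.
TEXT = "'brown':3 'dog':9,12 'fox':4 'jump':5 'lazi':8 'quick':2"


def positions_sorted(lexeme):
    """Binary search over the sorted lexeme array: O(log N) comparisons."""
    i = bisect.bisect_left(LEXEMES, lexeme)
    if i < len(LEXEMES) and LEXEMES[i] == lexeme:
        return POSITIONS[i]
    return []


def positions_text(text, lexeme):
    """Scan the textual form entry by entry: O(N) in the worst case."""
    # Simplified pattern: no weights (e.g. 3A) and no escaped quotes.
    for match in re.finditer(r"'([^']*)':([\d,]+)", text):
        if match.group(1) == lexeme:
            return [int(p) for p in match.group(2).split(",")]
    return []


if __name__ == "__main__":
    print(positions_sorted("dog"))
    print(positions_text(TEXT, "dog"))
```

Both functions return the same positions; the difference is only in how many entries they have to look at. A C function working on the stored tsvector would play the role of positions_sorted(), while a PL/pgSQL function built on regexp_split_to_table() corresponds to positions_text().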