Re: [HACKERS] tsvector extraction patch
On Fri, Jul 3, 2009 at 3:01 AM, Hans-Juergen Schoenig -- PostgreSQL postg...@cybertec.at wrote:
> Hans-Juergen Schoenig -- PostgreSQL wrote:
> > hello,
> >
> > this patch has not made it through yesterday, so i am trying to send it again.
> > i made a small patch which i found useful for my personal tasks.
> > it would be nice to see this in 8.5. if not core then maybe contrib.
> > it transforms a tsvector to table format which is really nice for text processing and comparison.
> >
> > test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
> >   lex   | rank
> > --------+------
> >  good   |    8
> >  patch  |    9
> >  pretti |    3
> >  sure   |    4
> > (4 rows)
> >
> > many thanks,
> >
> > hans

Hmm, looks like we never did anything about this. Hans-Juergen, you should probably update this and add it to the open CommitFest if you want it to be considered for 8.5.

https://commitfest.postgresql.org/action/commitfest_view/open

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tsvector extraction patch
Mike Rylander wrote:
> On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL postg...@cybertec.at wrote:
> > test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
> >   lex   | rank
> > --------+------
> >  good   |    8
> >  patch  |    9
> >  pretti |    3
> >  sure   |    4
> > (4 rows)
>
> This looks very useful! I wonder if providing a weight column would be relatively simple? I think this would present problems with the cast-to-text[] idea that Peter suggests, though.

Where would the weight come from?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Fwd: [HACKERS] tsvector extraction patch
Sorry, forgot to reply-all.

-- Forwarded message --
From: Mike Rylander mrylan...@gmail.com
Date: Wed, Jul 8, 2009 at 4:17 PM
Subject: Re: [HACKERS] tsvector extraction patch
To: Alvaro Herrera alvhe...@commandprompt.com

On Wed, Jul 8, 2009 at 3:38 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
> Mike Rylander wrote:
> > On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL postg...@cybertec.at wrote:
> > > test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
> > >   lex   | rank
> > > --------+------
> > >  good   |    8
> > >  patch  |    9
> > >  pretti |    3
> > >  sure   |    4
> > > (4 rows)
> >
> > This looks very useful! I wonder if providing a weight column would be relatively simple? I think this would present problems with the cast-to-text[] idea that Peter suggests, though.
>
> Where would the weight come from?

From a tsvector column that has weights set via setweight().

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: mi...@esilibrary.com
 | web: http://www.esilibrary.com
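For context on the weight question: in PostgreSQL's tsearch/ts_type.h, each stored position is a 16-bit WordEntryPos whose top 2 bits hold the weight and whose low 14 bits hold the position (see the WEP_GETWEIGHT and WEP_GETPOS macros). A weight column could therefore, in principle, be read from the same position data. The standalone C sketch below only mirrors that bit layout. All demo_* names are hypothetical and are not part of PostgreSQL or of the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's WordEntryPos (a 16-bit value). */
typedef uint16_t demo_pos;

/* Build a packed value: weight (0..3) in the top 2 bits, position in the low 14. */
static demo_pos demo_make_pos(int weight, int pos)
{
    assert(weight >= 0 && weight <= 3);
    assert(pos >= 0 && pos <= 0x3fff);
    return (demo_pos) ((weight << 14) | pos);
}

/* Same arithmetic as WEP_GETWEIGHT(x): (x) >> 14 */
static int demo_get_weight(demo_pos p)
{
    return p >> 14;
}

/* Same arithmetic as WEP_GETPOS(x): (x) & 0x3fff */
static int demo_get_pos(demo_pos p)
{
    return p & 0x3fff;
}

/* Map the numeric weight to its label: 3 = A, 2 = B, 1 = C, 0 = D (the default). */
static char demo_weight_label(demo_pos p)
{
    return "DCBA"[demo_get_weight(p)];
}
```

With this layout, a lexeme at position 8 whose weight was set to 'A' unpacks to position 8 and label A, while a lexeme that never went through setweight() reports the default label D.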
Re: [HACKERS] tsvector extraction patch
On Friday 03 July 2009 10:49:41 Hans-Juergen Schoenig -- PostgreSQL wrote:
> hello,
>
> this patch has not made it through yesterday, so i am trying to send it again.
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)

Sounds useful. But in the interest of orthogonality (or whatever), how about instead you write a cast from tsvector to text[], and then you can use unnest() to convert that to a table, e.g.,

SELECT * FROM unnest(CAST(to_tsvector('...') AS text[]));
Re: [HACKERS] tsvector extraction patch
On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL postg...@cybertec.at wrote:
> hello,
>
> this patch has not made it through yesterday, so i am trying to send it again.
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)

This looks very useful! I wonder if providing a weight column would be relatively simple? I think this would present problems with the cast-to-text[] idea that Peter suggests, though.

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: mi...@esilibrary.com
 | web: http://www.esilibrary.com
[HACKERS] tsvector extraction patch
hello,

this patch has not made it through yesterday, so i am trying to send it again.
i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

many thanks,

hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de
Re: [HACKERS] tsvector extraction patch
Hans-Juergen Schoenig -- PostgreSQL wrote:
> hello,
>
> this patch has not made it through yesterday, so i am trying to send it again.
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)
>
> many thanks,
>
> hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

diff -dcrpN postgresql-8.4.0.old/contrib/Makefile postgresql-8.4.0/contrib/Makefile
*** postgresql-8.4.0.old/contrib/Makefile	2009-03-26 00:20:01.0 +0100
--- postgresql-8.4.0/contrib/Makefile	2009-06-29 11:03:04.0 +0200
*************** WANTED_DIRS = \
*** 39,44 ****
--- 39,45 ----
  		tablefunc	\
  		test_parser	\
  		tsearch2	\
+ 		tsvcontent	\
  		vacuumlo
  
  ifeq ($(with_openssl),yes)
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/Makefile postgresql-8.4.0/contrib/tsvcontent/Makefile
*** postgresql-8.4.0.old/contrib/tsvcontent/Makefile	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/Makefile	2009-06-29 11:20:21.0 +0200
***************
*** 0 ****
--- 1,19 ----
+ # $PostgreSQL: pgsql/contrib/tablefunc/Makefile,v 1.9 2007/11/10 23:59:51 momjian Exp $
+ 
+ MODULES = tsvcontent
+ DATA_built = tsvcontent.sql
+ DATA = uninstall_tsvcontent.sql
+ 
+ 
+ SHLIB_LINK += $(filter -lm, $(LIBS))
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/tsvcontent
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c
*** postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c	2009-06-29 11:18:35.0 +0200
***************
*** 0 ****
--- 1,169 ----
+ #include "postgres.h"
+ 
+ #include "fmgr.h"
+ #include "funcapi.h"
+ #include "miscadmin.h"
+ #include "executor/spi.h"
+ #include "lib/stringinfo.h"
+ #include "nodes/nodes.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
+ #include "utils/syscache.h"
+ #include "utils/memutils.h"
+ #include "tsearch/ts_type.h"
+ #include "tsearch/ts_utils.h"
+ #include "catalog/pg_type.h"
+ 
+ #include "tsvcontent.h"
+ 
+ PG_MODULE_MAGIC;
+ 
+ PG_FUNCTION_INFO_V1(tsvcontent);
+ 
+ Datum
+ tsvcontent(PG_FUNCTION_ARGS)
+ {
+ 	FuncCallContext *funcctx;
+ 	TupleDesc	ret_tupdesc;
+ 	AttInMetadata *attinmeta;
+ 	int		call_cntr;
+ 	int		max_calls;
+ 	ts_to_txt_fctx *fctx;
+ 	Datum		result[2];
+ 	bool		isnull[2] = { false, false };
+ 	MemoryContext oldcontext;
+ 
+ 	/* input value containing the TS vector */
+ 	TSVector	in = PG_GETARG_TSVECTOR(0);
+ 
+ 	/* stuff done only on the first call of the function */
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		TupleDesc	tupdesc;
+ 		int		i, j;
+ 		char		*wepv_base;
+ 
+ 		/* create a function context for cross-call persistence */
+ 		funcctx = SRF_FIRSTCALL_INIT();
+ 
+ 		/*
+ 		 * switch to memory context appropriate for multiple function calls
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ 
+ 		switch (get_call_result_type(fcinfo, NULL, &tupdesc))
+ 		{
+ 			case TYPEFUNC_COMPOSITE:
+ 				/* success */
+ 				break;
+ 			case TYPEFUNC_RECORD:
+ 				/* failed to determine actual type of RECORD */
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 						 errmsg("function returning record called in context "
+ 								"that cannot accept type record")));
+ 				break;
+ 			default:
+ 				/* result type isn't composite */
+ 				elog(ERROR, "return type must be a row type");
+ 				break;
+ 		}
+ 
+ 		/* make sure we have a persistent copy of the tupdesc */
+ 		tupdesc = CreateTupleDescCopy(tupdesc);
+ 
+ 		/*
+ 		 * Generate attribute metadata needed later to produce tuples from raw
+ 		 * C strings
+ 		 */
+ 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ 		funcctx->attinmeta = attinmeta;
+ 
+ 		/* allocate memory */
+ 		fctx = (ts_to_txt_fctx *) palloc(sizeof(ts_to_txt_fctx));
+ 
+ 		wepv_base = (char *)in + offsetof(TSVectorData, entries) + in->size * sizeof(WordEntry);
+ 
+ 		fctx->n_tsvt = 0;
+ 		for (i = 0; i < in->size; i++)
+ 		{
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 					(wepv_base + in->entries[i].pos + SHORTALIGN(in->entries[i].len));
+ 
+ 				fctx->n_tsvt += wepv->npos;
+ 			}
+ 			else
+ 				fctx->n_tsvt++;
+ 		}
+ 
+ 		fctx->tsvt = palloc(fctx->n_tsvt * sizeof(tsvec_tuple));
+ 
+ 		for (i = 0, j = 0; i < in->size; i++)
+ 		{
+ 			int pos = in->entries[i].pos;
+
[HACKERS] tsvector extraction patch
hello,

i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

many thanks,

hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

diff -dcrpN postgresql-8.4.0.old/contrib/Makefile postgresql-8.4.0/contrib/Makefile
*** postgresql-8.4.0.old/contrib/Makefile	2009-03-26 00:20:01.0 +0100
--- postgresql-8.4.0/contrib/Makefile	2009-06-29 11:03:04.0 +0200
*************** WANTED_DIRS = \
*** 39,44 ****
--- 39,45 ----
  		tablefunc	\
  		test_parser	\
  		tsearch2	\
+ 		tsvcontent	\
  		vacuumlo
  
  ifeq ($(with_openssl),yes)
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/Makefile postgresql-8.4.0/contrib/tsvcontent/Makefile
*** postgresql-8.4.0.old/contrib/tsvcontent/Makefile	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/Makefile	2009-06-29 11:20:21.0 +0200
***************
*** 0 ****
--- 1,19 ----
+ # $PostgreSQL: pgsql/contrib/tablefunc/Makefile,v 1.9 2007/11/10 23:59:51 momjian Exp $
+ 
+ MODULES = tsvcontent
+ DATA_built = tsvcontent.sql
+ DATA = uninstall_tsvcontent.sql
+ 
+ 
+ SHLIB_LINK += $(filter -lm, $(LIBS))
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/tsvcontent
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c
*** postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c	2009-06-29 11:18:35.0 +0200
***************
*** 0 ****
--- 1,169 ----
+ #include "postgres.h"
+ 
+ #include "fmgr.h"
+ #include "funcapi.h"
+ #include "miscadmin.h"
+ #include "executor/spi.h"
+ #include "lib/stringinfo.h"
+ #include "nodes/nodes.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
+ #include "utils/syscache.h"
+ #include "utils/memutils.h"
+ #include "tsearch/ts_type.h"
+ #include "tsearch/ts_utils.h"
+ #include "catalog/pg_type.h"
+ 
+ #include "tsvcontent.h"
+ 
+ PG_MODULE_MAGIC;
+ 
+ PG_FUNCTION_INFO_V1(tsvcontent);
+ 
+ Datum
+ tsvcontent(PG_FUNCTION_ARGS)
+ {
+ 	FuncCallContext *funcctx;
+ 	TupleDesc	ret_tupdesc;
+ 	AttInMetadata *attinmeta;
+ 	int		call_cntr;
+ 	int		max_calls;
+ 	ts_to_txt_fctx *fctx;
+ 	Datum		result[2];
+ 	bool		isnull[2] = { false, false };
+ 	MemoryContext oldcontext;
+ 
+ 	/* input value containing the TS vector */
+ 	TSVector	in = PG_GETARG_TSVECTOR(0);
+ 
+ 	/* stuff done only on the first call of the function */
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		TupleDesc	tupdesc;
+ 		int		i, j;
+ 		char		*wepv_base;
+ 
+ 		/* create a function context for cross-call persistence */
+ 		funcctx = SRF_FIRSTCALL_INIT();
+ 
+ 		/*
+ 		 * switch to memory context appropriate for multiple function calls
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ 
+ 		switch (get_call_result_type(fcinfo, NULL, &tupdesc))
+ 		{
+ 			case TYPEFUNC_COMPOSITE:
+ 				/* success */
+ 				break;
+ 			case TYPEFUNC_RECORD:
+ 				/* failed to determine actual type of RECORD */
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 						 errmsg("function returning record called in context "
+ 								"that cannot accept type record")));
+ 				break;
+ 			default:
+ 				/* result type isn't composite */
+ 				elog(ERROR, "return type must be a row type");
+ 				break;
+ 		}
+ 
+ 		/* make sure we have a persistent copy of the tupdesc */
+ 		tupdesc = CreateTupleDescCopy(tupdesc);
+ 
+ 		/*
+ 		 * Generate attribute metadata needed later to produce tuples from raw
+ 		 * C strings
+ 		 */
+ 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ 		funcctx->attinmeta = attinmeta;
+ 
+ 		/* allocate memory */
+ 		fctx = (ts_to_txt_fctx *) palloc(sizeof(ts_to_txt_fctx));
+ 
+ 		wepv_base = (char *)in + offsetof(TSVectorData, entries) + in->size * sizeof(WordEntry);
+ 
+ 		fctx->n_tsvt = 0;
+ 		for (i = 0; i < in->size; i++)
+ 		{
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 					(wepv_base + in->entries[i].pos + SHORTALIGN(in->entries[i].len));
+ 
+ 				fctx->n_tsvt += wepv->npos;
+ 			}
+ 			else
+ 				fctx->n_tsvt++;
+ 		}
+ 
+ 		fctx->tsvt = palloc(fctx->n_tsvt * sizeof(tsvec_tuple));
+ 
+ 		for (i = 0, j = 0; i < in->size; i++)
+ 		{
+ 			int pos = in->entries[i].pos;
+ 			int len = in->entries[i].len;
+ 
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+
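The visible part of the patch uses a two-pass approach on the first call: it counts how many output rows the tsvector will produce (one per stored position, or a single row for a lexeme without positions), allocates that many tuples, and then fills them. The archived diff is cut off before the fill loop finishes, so the sketch below is only an illustration of that count-then-fill pattern in standalone C. The demo_entry and demo_row types, the malloc-based allocation, and the choice of 0 as the "no position" value are all assumptions made for the example. They are not the patch's tsvec_tuple, PostgreSQL's WordEntry, or palloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical simplified lexeme entry (not PostgreSQL's WordEntry). */
typedef struct
{
    const char *lexeme;
    const int  *positions;  /* NULL when the lexeme carries no positions */
    size_t      npos;       /* number of elements in positions */
} demo_entry;

/* One output row: a lexeme paired with a single position. */
typedef struct
{
    const char *lexeme;
    int         pos;        /* 0 is used here to mean "no position stored" */
} demo_row;

/* Pass 1: one row per position, or one row for an entry without positions. */
static size_t demo_count_rows(const demo_entry *entries, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].positions != NULL && entries[i].npos > 0)
            total += entries[i].npos;
        else
            total++;
    }
    return total;
}

/* Pass 2: allocate exactly that many rows and fill them. Caller frees. */
static demo_row *demo_expand(const demo_entry *entries, size_t n, size_t *nrows)
{
    demo_row *rows;
    size_t    j = 0;

    assert(nrows != NULL);
    *nrows = demo_count_rows(entries, n);

    /* Allocate at least one element so an empty input still returns a valid pointer. */
    rows = malloc((*nrows > 0 ? *nrows : 1) * sizeof(demo_row));
    if (rows == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].positions != NULL && entries[i].npos > 0)
        {
            for (size_t k = 0; k < entries[i].npos; k++)
            {
                rows[j].lexeme = entries[i].lexeme;
                rows[j].pos = entries[i].positions[k];
                j++;
            }
        }
        else
        {
            rows[j].lexeme = entries[i].lexeme;
            rows[j].pos = 0;
            j++;
        }
    }
    assert(j == *nrows);
    return rows;
}
```

Counting first keeps the allocation to a single call of the right size, which is convenient in a set-returning function because the array has to survive across calls in the multi-call memory context.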
Re: [HACKERS] tsvector extraction patch
On Thu, Jul 2, 2009 at 10:13 AM, Hans-Juergen Schoenig -- PostgreSQL postg...@cybertec.at wrote:
> hello,
>
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)
>
> many thanks,
>
> hans

If you'd like this reviewed for the next CommitFest, please add it to the wiki here:

http://wiki.postgresql.org/wiki/CommitFestOpen

...Robert
Re: [HACKERS] tsvector extraction patch
I have simple solution http://www.sai.msu.su/~megera/wiki/2008-12-17

On Thu, 2 Jul 2009, Hans-Juergen Schoenig -- PostgreSQL wrote:
> hello,
>
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)
>
> many thanks,
>
> hans

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83