Re: [HACKERS] tsvector extraction patch

2010-01-07 Thread Robert Haas
On Fri, Jul 3, 2009 at 3:01 AM, Hans-Juergen Schoenig -- PostgreSQL
<postg...@cybertec.at> wrote:
> Hans-Juergen Schoenig -- PostgreSQL wrote:
>>
>> hello,
>>
>> this patch has not made it through yesterday, so i am trying to send it
>> again.
>> i made a small patch which i found useful for my personal tasks.
>> it would be nice to see this in 8.5. if not core then maybe contrib.
>> it transforms a tsvector to table format which is really nice for text
>> processing and comparison.
>>
>> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>>   lex   | rank
>> --------+------
>>  good   |    8
>>  patch  |    9
>>  pretti |    3
>>  sure   |    4
>> (4 rows)
>>
>>   many thanks,
>>
>>      hans

Hmm, looks like we never did anything about this.  Hans-Juergen, you
should probably update this and add it to the open CommitFest if you
want it to be considered for 8.5.

https://commitfest.postgresql.org/action/commitfest_view/open

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tsvector extraction patch

2009-07-08 Thread Alvaro Herrera
Mike Rylander wrote:
> On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL
> <postg...@cybertec.at> wrote:
>
>> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>>   lex   | rank
>> --------+------
>>  good   |    8
>>  patch  |    9
>>  pretti |    3
>>  sure   |    4
>> (4 rows)
>
> This looks very useful!  I wonder if providing a weight column would
> be relatively simple?  I think this would present problems with the
> cast-to-text[] idea that Peter suggests, though.

Where would the weight come from?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Fwd: [HACKERS] tsvector extraction patch

2009-07-08 Thread Mike Rylander
Sorry, forgot to reply-all.


---------- Forwarded message ----------
From: Mike Rylander <mrylan...@gmail.com>
Date: Wed, Jul 8, 2009 at 4:17 PM
Subject: Re: [HACKERS] tsvector extraction patch
To: Alvaro Herrera <alvhe...@commandprompt.com>


On Wed, Jul 8, 2009 at 3:38 PM, Alvaro Herrera
<alvhe...@commandprompt.com> wrote:
> Mike Rylander wrote:
>> On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL
>> <postg...@cybertec.at> wrote:
>>
>>> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>>>   lex   | rank
>>> --------+------
>>>  good   |    8
>>>  patch  |    9
>>>  pretti |    3
>>>  sure   |    4
>>> (4 rows)
>>
>> This looks very useful!  I wonder if providing a weight column would
>> be relatively simple?  I think this would present problems with the
>> cast-to-text[] idea that Peter suggests, though.
>
> Where would the weight come from?


From a tsvector column that has weights set via setweight().
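
As a rough illustration of that answer: as far as I can tell from
tsearch/ts_type.h, each position stored in a tsvector is a 16-bit value
with a 2-bit weight packed above a 14-bit word position, and setweight()
rewrites those two bits. The sketch below is standalone C that only
imitates this layout. The names WordPos, wordpos_make, wordpos_weight,
wordpos_position and wordpos_label are made up for illustration and are
not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a stored position: 2 weight bits above 14 position bits. */
typedef uint16_t WordPos;

/* Pack a weight (0..3) and a word position (0..16383) into one value. */
static WordPos
wordpos_make(unsigned weight, unsigned pos)
{
    assert(weight <= 3 && pos <= 0x3fff);
    return (WordPos) ((weight << 14) | pos);
}

/* The weight lives in the top two bits. */
static unsigned
wordpos_weight(WordPos p)
{
    return p >> 14;
}

/* The word position lives in the low 14 bits. */
static unsigned
wordpos_position(WordPos p)
{
    return p & 0x3fff;
}

/* Map the stored weight to its label: 3 is 'A', 2 is 'B', 1 is 'C', 0 is the default 'D'. */
static char
wordpos_label(WordPos p)
{
    return "DCBA"[wordpos_weight(p)];
}
```

Under this layout a weight column would cost one extra shift per output
row: wordpos_label(wordpos_make(3, 8)) gives 'A' for the lexeme at
position 8, and an unweighted position reports the default 'D'.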

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@esilibrary.com
 | web:  http://www.esilibrary.com





Re: [HACKERS] tsvector extraction patch

2009-07-06 Thread Peter Eisentraut
On Friday 03 July 2009 10:49:41 Hans-Juergen Schoenig -- PostgreSQL wrote:
> hello,
>
> this patch has not made it through yesterday, so i am trying to send it
> again.
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text
> processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)

Sounds useful.  But in the interest of orthogonality (or whatever), how about 
instead you write a cast from tsvector to text[], and then you can use 
unnest() to convert that to a table, e.g.,

SELECT * FROM unnest(CAST(to_tsvector('...') AS text[]));




Re: [HACKERS] tsvector extraction patch

2009-07-06 Thread Mike Rylander
On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL
<postg...@cybertec.at> wrote:
> hello,
>
> this patch has not made it through yesterday, so i am trying to send it
> again.
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text
> processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)


This looks very useful!  I wonder if providing a weight column would
be relatively simple?  I think this would present problems with the
cast-to-text[] idea that Peter suggests, though.

-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@esilibrary.com
 | web:  http://www.esilibrary.com



[HACKERS] tsvector extraction patch

2009-07-03 Thread Hans-Juergen Schoenig -- PostgreSQL

hello,

this patch has not made it through yesterday, so i am trying to send it 
again.

i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text 
processing and comparison.


test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

  many thanks,

 hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de




Re: [HACKERS] tsvector extraction patch

2009-07-03 Thread Hans-Juergen Schoenig -- PostgreSQL

Hans-Juergen Schoenig -- PostgreSQL wrote:

> hello,
>
> this patch has not made it through yesterday, so i am trying to send
> it again.
>
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text
> processing and comparison.
>
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)
>
>   many thanks,
>
>      hans




--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


[HACKERS] tsvector extraction patch

2009-07-02 Thread Hans-Juergen Schoenig -- PostgreSQL

hello,

i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text 
processing and comparison.


test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)
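
Judging by the sample output, the rank column appears to be the word
position of each lexeme in the input ('pretty' is the third word, 'good'
the eighth). A minimal sketch of the flattening step in standalone C,
assuming a count-then-fill approach like the one visible in the attached
patch. LexemeEntry, Row, count_rows and flatten are hypothetical names,
and emitting rank 0 for an entry without positions is an assumption of
this sketch, not something the thread states.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one tsvector entry: a lexeme and its positions. */
typedef struct
{
    const char *lexeme;
    const int  *positions;   /* may be NULL when npos is 0 */
    size_t      npos;
} LexemeEntry;

/* One output row, shaped like the (lex, rank) columns above. */
typedef struct
{
    const char *lex;
    int         rank;
} Row;

/* Pass 1: an entry yields one row per position, or one row if it has none. */
static size_t
count_rows(const LexemeEntry *entries, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += entries[i].npos > 0 ? entries[i].npos : 1;
    return total;
}

/* Pass 2: allocate once, then fill. Returns NULL on allocation failure; caller frees. */
static Row *
flatten(const LexemeEntry *entries, size_t n, size_t *nrows)
{
    size_t total = count_rows(entries, n);
    Row   *rows = malloc((total > 0 ? total : 1) * sizeof(Row));
    size_t j = 0;

    if (rows == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].npos == 0)
        {
            /* assumption: report rank 0 when no position was stored */
            rows[j].lex = entries[i].lexeme;
            rows[j].rank = 0;
            j++;
            continue;
        }
        for (size_t k = 0; k < entries[i].npos; k++)
        {
            rows[j].lex = entries[i].lexeme;
            rows[j].rank = entries[i].positions[k];
            j++;
        }
    }
    *nrows = total;
    return rows;
}
```

Counting first means the row array is allocated exactly once, which is
convenient when the rows have to survive across repeated calls of a
set-returning function.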

   many thanks,

  hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

diff -dcrpN postgresql-8.4.0.old/contrib/Makefile postgresql-8.4.0/contrib/Makefile
*** postgresql-8.4.0.old/contrib/Makefile	2009-03-26 00:20:01.0 +0100
--- postgresql-8.4.0/contrib/Makefile	2009-06-29 11:03:04.0 +0200
*************** WANTED_DIRS = \
*** 39,44 ****
--- 39,45 ----
  		tablefunc	\
  		test_parser	\
  		tsearch2	\
+ 		tsvcontent	\
  		vacuumlo
  
  ifeq ($(with_openssl),yes)
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/Makefile postgresql-8.4.0/contrib/tsvcontent/Makefile
*** postgresql-8.4.0.old/contrib/tsvcontent/Makefile	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/Makefile	2009-06-29 11:20:21.0 +0200
***************
*** 0 ****
--- 1,19 ----
+ # $PostgreSQL: pgsql/contrib/tablefunc/Makefile,v 1.9 2007/11/10 23:59:51 momjian Exp $
+ 
+ MODULES = tsvcontent
+ DATA_built = tsvcontent.sql
+ DATA = uninstall_tsvcontent.sql
+ 
+ 
+ SHLIB_LINK += $(filter -lm, $(LIBS))
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/tsvcontent
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c
*** postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c	2009-06-29 11:18:35.0 +0200
***************
*** 0 ****
--- 1,169 ----
+ #include "postgres.h"
+ 
+ #include "fmgr.h"
+ #include "funcapi.h"
+ #include "miscadmin.h"
+ #include "executor/spi.h"
+ #include "lib/stringinfo.h"
+ #include "nodes/nodes.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
+ #include "utils/syscache.h"
+ #include "utils/memutils.h"
+ #include "tsearch/ts_type.h"
+ #include "tsearch/ts_utils.h"
+ #include "catalog/pg_type.h"
+ 
+ #include "tsvcontent.h"
+ 
+ PG_MODULE_MAGIC;
+ 
+ PG_FUNCTION_INFO_V1(tsvcontent);
+ 
+ Datum
+ tsvcontent(PG_FUNCTION_ARGS)
+ {
+ 	FuncCallContext 	*funcctx;
+ 	TupleDesc		ret_tupdesc;
+ 	AttInMetadata		*attinmeta;
+ 	int			call_cntr;
+ 	int			max_calls;
+ 	ts_to_txt_fctx		*fctx;
+ 	Datum			result[2];
+ 	bool			isnull[2] = { false, false };
+ 	MemoryContext 		oldcontext;
+ 
+ 	/* input value containing the TS vector */
+ 	TSVector	in = PG_GETARG_TSVECTOR(0);
+ 
+ 	/* stuff done only on the first call of the function */
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		TupleDesc	tupdesc;
+ 		int		i, j;
+ 		char		*wepv_base;
+ 
+ 		/* create a function context for cross-call persistence */
+ 		funcctx = SRF_FIRSTCALL_INIT();
+ 
+ 		/*
+ 		 * switch to memory context appropriate for multiple function calls
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ 
+ 		switch (get_call_result_type(fcinfo, NULL, &tupdesc))
+ 		{
+ 			case TYPEFUNC_COMPOSITE:
+ 				/* success */
+ 				break;
+ 			case TYPEFUNC_RECORD:
+ 				/* failed to determine actual type of RECORD */
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 						errmsg("function returning record called in context "
+ 								"that cannot accept type record")));
+ 				break;
+ 			default:
+ 				/* result type isn't composite */
+ 				elog(ERROR, "return type must be a row type");
+ 				break;
+ 		}
+ 
+ 		/* make sure we have a persistent copy of the tupdesc */
+ 		tupdesc = CreateTupleDescCopy(tupdesc);
+ 
+ 		/*
+ 		 * Generate attribute metadata needed later to produce tuples from raw
+ 		 * C strings
+ 		 */
+ 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ 		funcctx->attinmeta = attinmeta;
+ 
+ 		/* allocate memory */
+ 		fctx = (ts_to_txt_fctx *) palloc(sizeof(ts_to_txt_fctx));
+ 
+ 		wepv_base = (char *)in + offsetof(TSVectorData, entries) + in->size * sizeof(WordEntry);
+ 		
+ 		fctx->n_tsvt = 0;
+ 		for (i = 0; i < in->size; i++)
+ 		{
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 					(wepv_base + in->entries[i].pos + SHORTALIGN(in->entries[i].len));
+ 
+ 				fctx->n_tsvt += wepv->npos;
+ 			}
+ 			else
+ 				fctx->n_tsvt++;
+ 		}
+ 
+ 		fctx->tsvt = palloc(fctx->n_tsvt * sizeof(tsvec_tuple));
+ 
+ 		for (i = 0, j = 0; i < in->size; i++)
+ 		{
+ 			int pos = in->entries[i].pos;
+ 			int len = in->entries[i].len;
+ 
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 

Re: [HACKERS] tsvector extraction patch

2009-07-02 Thread Robert Haas
On Thu, Jul 2, 2009 at 10:13 AM, Hans-Juergen Schoenig -- PostgreSQL
<postg...@cybertec.at> wrote:
> hello,
>
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text
> processing and comparison.
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)
>
>   many thanks,
>
>      hans

If you'd like this reviewed for the next CommitFest, please add it to
the wiki here:

http://wiki.postgresql.org/wiki/CommitFestOpen

...Robert



Re: [HACKERS] tsvector extraction patch

2009-07-02 Thread Oleg Bartunov
I have a simple solution:
http://www.sai.msu.su/~megera/wiki/2008-12-17



On Thu, 2 Jul 2009, Hans-Juergen Schoenig -- PostgreSQL wrote:


> hello,
>
> i made a small patch which i found useful for my personal tasks.
> it would be nice to see this in 8.5. if not core then maybe contrib.
> it transforms a tsvector to table format which is really nice for text
> processing and comparison.
>
>
> test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
>   lex   | rank
> --------+------
>  good   |    8
>  patch  |    9
>  pretti |    3
>  sure   |    4
> (4 rows)
>
>   many thanks,
>
>      hans




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
