[PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-08 Thread Jan Urbański
Hi, the rationale for this patch is rather complicated, as it's related to the peculiarities of Polish grammar. Please read on. I'm using PostgreSQL 8.2.4 and the ispell tsearch2 dictionary. The problem is as follows. In Polish (and possibly other languages that don't come to my mind at the momen

Re: [PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-09 Thread Jan Urbański
>> The solution I came up with was simple: write a dictionary, that does
>> only one thing: looks up the lexeme in a stopwords file and either
>> discards it or returns NULL.
>
> Doesn't the "simple" dictionary handle this?
I don't think so. The 'simple' dictionary discards stopwords, but accepts

Re: [PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-09 Thread Jan Urbański
> dictionaries. In this case, you would first check against one stopword
> list, eliminating 'od', then check the ispell dictionary, and then check
> another stopword list without 'od'.
My problem is basically solved using the patch I sent earlier. I use '{stop, pl_ispell, simple}' which has the e
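The '{stop, pl_ispell, simple}' chain mentioned above can be modelled with a small Python sketch. This is a toy model, not the tsearch2 C API: the function names, the `None`/`[]`/list return convention used here to mimic "not recognized / stopword / lexemes", and the single stem-table entry are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Set

# Toy convention for a dictionary's answer (an assumption of this sketch):
#   None        -> token not recognized, hand it to the next dictionary
#   []          -> token is a stopword, discard it
#   [lexeme...] -> token recognized, these are its normalized lexemes
Dictionary = Callable[[str], Optional[List[str]]]


def make_stop(stopwords: Set[str]) -> Dictionary:
    """Stopword-only filter: discard stopwords, recognize nothing else."""
    def lookup(token: str) -> Optional[List[str]]:
        return [] if token.lower() in stopwords else None
    return lookup


def make_table(stems: Dict[str, str]) -> Dictionary:
    """Stand-in for an ispell dictionary: a fixed word -> stem table."""
    def lookup(token: str) -> Optional[List[str]]:
        stem = stems.get(token.lower())
        return [stem] if stem is not None else None
    return lookup


def simple(token: str) -> Optional[List[str]]:
    """Catch-all at the end of the chain: accept any token, lowercased."""
    return [token.lower()]


def run_chain(chain: List[Dictionary], token: str) -> List[str]:
    """Ask each dictionary in order; the first non-None answer wins."""
    for dictionary in chain:
        result = dictionary(token)
        if result is not None:
            return result
    return []  # nobody recognized the token, so it is dropped


stop = make_stop({"od"})                   # 'od' is the stopword from the thread
pl_ispell = make_table({"kotami": "kot"})  # hypothetical stem-table entry
chain = [stop, pl_ispell, simple]

lexemes = [run_chain(chain, t) for t in ["od", "kotami", "PostgreSQL"]]
```

In this model the stopword never reaches the stemmer, known words are stemmed, and everything else falls through to the catch-all.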

Re: [PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-09 Thread Jan Urbański
> This example still doesn't seem very convincing --- why would you not
> merely attach the stopword list to the pl_ispell dictionary?
Because the ispell-based dictionaries first stem the lexeme and then search for it in the stopwords file. The situation here is that a stopword is first stemmed to
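The ordering issue described here (stem first, then consult the stopword list) can be illustrated with a short Python sketch. It is only a model of the two orderings, not the ispell dictionary code; the stem chosen for 'od' is a made-up placeholder, since the message is cut off before naming the real one.

```python
from typing import List, Optional

STOPWORDS = {"od"}
# Hypothetical stem table: assume the stemmer rewrites the stopword
# into some other word that is NOT on the stopword list.
STEMS = {"od": "oda"}


def stem_then_filter(token: str) -> Optional[List[str]]:
    """Order described for ispell-based dictionaries: stem, then check stopwords."""
    stem = STEMS.get(token)
    if stem is None:
        return None                      # not recognized
    return [] if stem in STOPWORDS else [stem]


def filter_then_stem(token: str) -> Optional[List[str]]:
    """Order obtained by putting a stopword-only dictionary first."""
    if token in STOPWORDS:
        return []                        # discarded before stemming
    stem = STEMS.get(token)
    return [stem] if stem is not None else None


leaked = stem_then_filter("od")      # the stopword survives as its stem
filtered = filter_then_stem("od")    # the stopword is discarded
```

Under these assumptions, attaching the stopword list to the stemming dictionary does not remove the word, while filtering before stemming does.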

Re: [PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-09 Thread Jan Urbański
> That doesn't have a whole lot to do with where we are today:
> http://developer.postgresql.org/pgdocs/postgres/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY
> http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/tsearch/dict_simple.c
Great, I didn't know the API was that conv

Re: [PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-14 Thread Jan Urbański
Jan Urbański wrote:
> Great, I didn't know the API was that convenient in 8.3. I'll try
> posting a working patch for 8.3 during the weekend.
Here's the patch for 8.3beta2. As was suggested I added a configuration parameter to the 'simple' dictionary called Acce

Re: [PATCHES] a tsearch2 (8.2.4) dictionary that only filters out stopwords

2007-11-14 Thread Jan Urbański
> This bit should be replaced with defGetBoolean. Otherwise it looks
> reasonably sane.
Fixed that, thank you.
Regards, Jan Urbanski
--
Jan Urbanski GPG key ID: E583D7D2 ouden estin
diff -Naur postgresql-8.3beta2-orig/doc/src/sgml/textsearch.sgml postgresql-8.3beta2/doc/src/sgml/textsearch.sg

[PATCHES] typo in func.sgml

2007-12-20 Thread Jan Urbański
Cheers, Jan Urbanski
--
Jan Urbanski GPG key ID: E583D7D2 ouden estin
Index: pgsql/doc/src/sgml/func.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.418
diff -u -r1.418 func.sgml
--- pgsql/do

[PATCHES] extend VacAttrStats to allow stavalues of different types

2008-05-15 Thread Jan Urbański
Following the conclusion here: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00273.php here's a patch that extends VacAttrStats to allow typanalyze functions to store statistic values of different types than the underlying column. The XXX comment can be taken into consideration or jus

Re: [PATCHES] extend VacAttrStats to allow stavalues of different types

2008-05-15 Thread Jan Urbański
Jan Urbański wrote: Following the conclusion here: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00273.php here's a patch that extends VacAttrStats to allow typanalyze functions to store statistic values of different types than the underlying column. The XXX comment can be

Re: [PATCHES] extend VacAttrStats to allow stavalues of different types

2008-06-02 Thread Jan Urbański
Heikki Linnakangas wrote: I tried to google for a user defined data type with a custom typanalyze function but didn't find anything, so I don't think it's an issue. It's a bit nasty, though, because if one exists, it will compile and run just fine, until you run ANALYZE. In general it's better

Re: [PATCHES] extend VacAttrStats to allow stavalues of different types

2008-06-02 Thread Jan Urbański
Tom Lane wrote: I think the correct solution is to initialize the fields to match the column type before calling the typanalyze function. Then you don't break compatibility for existing typanalyze functions. It's also less code, since the standard typanalyze functions can rely on those preset v
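The approach quoted here (preset the fields to match the column type before the typanalyze function runs) can be sketched in Python. This is a loose model only: `AttrStats`, `value_type` and the two callbacks are hypothetical names standing in for VacAttrStats and its type fields, and the type names are arbitrary examples.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AttrStats:
    """Loose stand-in for VacAttrStats; field names are hypothetical."""
    column_type: str
    value_type: Optional[str] = None   # type of the stored statistic values


def legacy_typanalyze(stats: AttrStats) -> None:
    """An existing function that knows nothing about value_type."""


def custom_typanalyze(stats: AttrStats) -> None:
    """A newer function that stores values of a different type."""
    stats.value_type = "text"


def examine_attribute(column_type: str,
                      typanalyze: Callable[[AttrStats], None]) -> AttrStats:
    stats = AttrStats(column_type)
    # Preset the value type to the column type BEFORE calling the callback,
    # so callbacks that never touch the field still leave it valid.
    stats.value_type = stats.column_type
    typanalyze(stats)
    return stats


legacy = examine_attribute("tsvector", legacy_typanalyze)
custom = examine_attribute("tsvector", custom_typanalyze)
```

With the preset in place, the legacy callback keeps the column's own type and only a callback that opts in changes it, which is the compatibility property the message argues for.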

[PATCHES] minor ts_type.h comment fix

2008-06-09 Thread Jan Urbański
These should read TSQuery, not TSVector, no?
--
Jan Urbanski GPG key ID: E583D7D2 ouden estin
*** src/include/tsearch/ts_type.h
--- src/include/tsearch/ts_type.h 2008-06-09 23:41:26.0 +0200
***
*** 239,248
*/
#define COMPUTESIZE(size, lenofoperand) ( HDRSIZETQ