Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-24 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Tom Lane wrote: Something that was annoying me yesterday was that it was not clear whether we had fixed every single place that uses a tsearch config file to assume that the file is in UTF8 and should be converted to database encoding. So I was thinking of

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-24 Thread Heikki Linnakangas
And here's the attachment I forgot. Heikki Linnakangas wrote: Heikki Linnakangas wrote: Tom Lane wrote: Something that was annoying me yesterday was that it was not clear whether we had fixed every single place that uses a tsearch config file to assume that the file is in UTF8 and should be

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-24 Thread Tom Lane
Heikki Linnakangas writes: Ok, here's an updated version of the patch. I haven't actually read this patch yet, but the description all sounds like the Right Thing now. Will review and commit today. Also, I believe there's consensus to rename the standard Snowball dictionaries

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-24 Thread Tom Lane
Heikki Linnakangas writes: Ok, here's an updated version of the patch. Applied, with a few trivial additional cleanups I noticed while reading the patch. I included your HeadlineText de-duplication too. regards, tom lane

[PATCHES] Bunch of tsearch fixes and cleanup

2007-08-23 Thread Heikki Linnakangas
Fixes the following bugs:
- ispell initialization crashed on empty dictionary file
- ispell initialization crashed on affix file with prefixes but no suffixes
- stop words file was run through pg_verify_mbstr, with database encoding, but it's later interpreted as being UTF-8. Now verifies that

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-23 Thread Tom Lane
Heikki Linnakangas writes: - readstopwords calls recode_and_lowerstr directly, instead of using the wordop function pointer in StopList struct. All callers used recode_and_lowerstr anyway, so this simplifies the code a little bit. Is there any external dictionary

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-23 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas writes: - readstopwords calls recode_and_lowerstr directly, instead of using the wordop function pointer in StopList struct. All callers used recode_and_lowerstr anyway, so this simplifies the code a little bit. Is there any external

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-23 Thread Oleg Bartunov
On Thu, 23 Aug 2007, Tom Lane wrote: Heikki Linnakangas writes: - readstopwords calls recode_and_lowerstr directly, instead of using the wordop function pointer in StopList struct. All callers used recode_and_lowerstr anyway, so this simplifies the code a little bit. Is

Re: [PATCHES] Bunch of tsearch fixes and cleanup

2007-08-23 Thread Heikki Linnakangas
Tom Lane wrote: Something that was annoying me yesterday was that it was not clear whether we had fixed every single place that uses a tsearch config file to assume that the file is in UTF8 and should be converted to database encoding. So I was thinking of hardwiring the recode part into
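
For readers following the encoding discussion in this thread: the underlying idea is that a tsearch config file is assumed to be UTF-8, so its bytes should be checked as UTF-8 before they are converted to the database encoding, rather than being checked against the database encoding. The following is only a minimal, standalone sketch of such a check. The function name sketch_is_valid_utf8 is hypothetical; it is not pg_verify_mbstr or any other PostgreSQL routine, and it does not reproduce the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only (not a PostgreSQL function).
 * Returns true if buf[0..len) is well-formed UTF-8: no stray continuation
 * bytes, no truncated sequences, no overlong forms, no surrogates, and no
 * code points above U+10FFFF.
 */
bool
sketch_is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = buf[i];
        size_t        need;     /* number of continuation bytes expected */
        unsigned long cp;       /* decoded code point */
        size_t        k;

        if (c < 0x80)
        {
            i++;                /* plain ASCII byte */
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; }
        else
            return false;       /* stray continuation or invalid lead byte */

        if (len - i - 1 < need)
            return false;       /* sequence is cut off at end of buffer */

        for (k = 1; k <= need; k++)
        {
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }

        /* Reject overlong encodings, surrogates, and out-of-range values. */
        if ((need == 1 && cp < 0x80) ||
            (need == 2 && cp < 0x800) ||
            (need == 3 && cp < 0x10000))
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;
        if (cp > 0x10FFFF)
            return false;

        i += need + 1;
    }
    return true;
}
```

With a check like this, a stop word line stored as Latin-1 (for example the single byte 0xE9) would be rejected up front instead of being misread later as the start of a multibyte UTF-8 sequence.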