> dictionaries. In this case, you would first check against one stopword
> list, eliminating 'od', then check the ispell dictionary, and then check
> another stopword list without 'od'.

My problem is basically solved by the patch I sent earlier. I use '{stop, pl_ispell, simple}', which has the effect of:

a) eliminating words that are stopwords but, when stemmed, produce non-stopwords (such as 'od', which gets stemmed to 'oda')
b) stemming non-stopwords properly (using an ispell dictionary)
c) indexing words that are not recognized by ispell (for instance, 'postgresql' gets indexed as 'postgresql')

> I suggested that a while ago
> (http://archives.postgresql.org/pgsql-hackers/2007-08/msg01036.php).
> Hopefully Oleg or someone else gets around restructuring the
> dictionaries in a future release.

I'm glad to see I'm not the only one who needs a more sophisticated way of dealing with dictionary chaining. I do understand, however, the problems that arise when one wants to extend the dictionary API beyond the reject/accept/pass-on scheme. For these three outcomes we have an easy way of passing the result back from lexize: it returns an empty array, an array of stemmed lexemes, or NULL, respectively. If more complex actions were to be taken, I'm afraid lexize would have to return something more complex than just text[].

> I wonder if you could hack the ispell dictionary file to treat oda
> specially?

I thought about it, but it turned out that writing a custom dictionary was easier than figuring out how ispell works internally.

Regards,
-- 
Jan Urbanski
GPG key ID: E583D7D2

ouden estin
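
To make the reject/accept/pass-on behaviour of the '{stop, pl_ispell, simple}' chain concrete, here is a minimal Python sketch. It is only a toy model of the semantics described above, not the actual PostgreSQL code or the patch: the function names (make_stop, make_toy_ispell, simple, run_chain) are made up for illustration, the "ispell" stand-in is a plain lookup table, and the 'koty' -> 'kot' entry is invented sample data. Python's None plays the role of NULL, and an empty list plays the role of an empty array.

```python
from typing import Callable, Dict, List, Optional, Set

# A toy "lexize": returns [] to reject, a list of lexemes to accept,
# or None (standing in for NULL) to pass the token on.
Lexize = Callable[[str], Optional[List[str]]]


def make_stop(stopwords: Set[str]) -> Lexize:
    """Hypothetical stopword-only dictionary: reject stopwords, pass on the rest."""
    def lexize(token: str) -> Optional[List[str]]:
        return [] if token.lower() in stopwords else None
    return lexize


def make_toy_ispell(stems: Dict[str, str]) -> Lexize:
    """Stand-in for an ispell dictionary: a lookup table, not real affix logic."""
    def lexize(token: str) -> Optional[List[str]]:
        stem = stems.get(token.lower())
        return [stem] if stem is not None else None
    return lexize


def simple(token: str) -> Optional[List[str]]:
    """Accepts every token, lowercased, so nothing falls off the end of the chain."""
    return [token.lower()]


def run_chain(chain: List[Lexize], token: str) -> List[str]:
    """Ask each dictionary in order; the first non-None answer wins."""
    for lexize in chain:
        result = lexize(token)
        if result is not None:
            return result
    return []  # no dictionary recognized the token, so it is not indexed


# Toy data: 'od' -> 'oda' is the case from this thread; 'koty' -> 'kot' is invented.
stop = make_stop({"od"})
toy_ispell = make_toy_ispell({"od": "oda", "koty": "kot"})
chain = [stop, toy_ispell, simple]

for word in ("od", "koty", "postgresql"):
    print(word, "->", run_chain(chain, word))
```

Dropping the leading stop dictionary from the chain lets 'od' reach the ispell stand-in and come out as 'oda', which is the problem the extra dictionary works around.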