On Thu, Jul 4, 2019 at 1:39 PM Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
> > Last November snowball added support for Greek language [1]. Following
> > the instructions [2], I wrote a patch that adds fulltext search for
> > Greek in Postgres. The patch is attached.
>
> I have committed a full sync from the upstream snowball repository,
> which pulled in the new greek stemmer.
>
> Could you please clarify where you got the stopword list from? The
> README says those need to be downloaded separately, but I wasn't able to
> find the download location. It would be good to document this, for
> example in the commit message. I haven't committed the stopword list yet.
>

Thank you Peter,

Here is the repo with the stop-words:
https://github.com/pmav99/greek_stopwords

The list is based on an earlier publication, with modifications by me.
All the relevant info is on GitHub.

Disclaimer 1: The list has not been validated by an expert.

Disclaimer 2: There are more stop-word lists on the internet, but they
are less complete and they also include Ancient Greek words.
Furthermore, my testing showed that snowball needs to handle accents
(tonous) and ς (teliko sigma) in a special way if you want the stemmer
to work with capitalized words too.

https://github.com/Xangis/extra-stopwords/blob/master/greek
https://github.com/stopwords-iso/stopwords-el/tree/master/raw

all the best,
Panagiotis
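
To illustrate the point about accents and teliko sigma: the sketch below is only an assumption about what such special handling could look like, not the snowball stemmer's actual code. The helper name normalize_greek is hypothetical. It strips the tonos, lowercases, and folds ς to σ, so that a capitalized word and its lowercase accented form compare equal before stemming or stopword lookup.

```python
import unicodedata


def normalize_greek(word: str) -> str:
    """Hypothetical helper: fold case, accents and final sigma."""
    # NFD splits an accented letter into base letter + combining mark,
    # e.g. "ό" becomes "ο" followed by U+0301.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop every combining mark. Note that this removes the dialytika
    # as well as the tonos, which may or may not be what you want.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Python's lower() turns a word-final "Σ" into "ς", so fold the
    # final sigma to the regular one afterwards.
    return stripped.lower().replace("ς", "σ")


if __name__ == "__main__":
    # Both spellings should fold to the same string.
    print(normalize_greek("ΚΑΛΌΣ") == normalize_greek("καλός"))
```

A stopword list would need to go through the same folding, otherwise entries that contain accents or a final sigma would no longer match the normalized input.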