On Thu, Jul 4, 2019 at 1:39 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
> > Last November snowball added support for Greek language [1]. Following
> > the instructions [2], I wrote a patch that adds fulltext search for
> > Greek in Postgres. The patch is attached.
>
> I have committed a full sync from the upstream snowball repository,
> which pulled in the new greek stemmer.
>
> Could you please clarify where you got the stopword list from?  The
> README says those need to be downloaded separately, but I wasn't able to
> find the download location.  It would be good to document this, for
> example in the commit message.  I haven't committed the stopword list yet.
>

Thank you Peter,

Here is the repo with the stop-words:
https://github.com/pmav99/greek_stopwords
The list is based on an earlier publication with modification by me. All
the relevant info is on github.

Disclaimer 1: The list has not been validated by an expert.

Disclaimer 2: There are more stop-words lists on the internet, but they are
less complete and they also use ancient greek words. Furthermore, my
testing showed that snowball needs to handle accents (tonous) and ς (teliko
sigma) in a special way if you want the stemmer to work with capitalized
words too.

https://github.com/Xangis/extra-stopwords/blob/master/greek
https://github.com/stopwords-iso/stopwords-el/tree/master/raw

all the best,
Panagiotis

Reply via email to