Re: Feature: Add Greek language fulltext search

2019-07-11 Thread Adrien Nayrat
On 7/4/19 1:39 PM, Peter Eisentraut wrote:
> On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
>> Last November snowball added support for Greek language [1]. Following
>> the instructions [2], I wrote a patch that adds fulltext search for
>> Greek in Postgres. The patch is attached. 
> 
> I have committed a full sync from the upstream snowball repository,
> which pulled in the new greek stemmer.
> 
> Could you please clarify where you got the stopword list from?  The
> README says those need to be downloaded separately, but I wasn't able to
> find the download location.  It would be good to document this, for
> example in the commit message.  I haven't committed the stopword list yet.
> 

Thanks. I noticed that snowball pushed a new commit related to the greek
stemmer a few days after your sync:
https://github.com/snowballstem/snowball/commit/533602101f963eeb0c38343d94c428ceef740c0c

Since Snowball does not seem to have a policy for stable releases, I don't
know what the best way to keep in sync is :(






Re: Feature: Add Greek language fulltext search

2019-07-09 Thread Panagiotis Mavrogiorgos
On Thu, Jul 4, 2019 at 1:39 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
> > Last November snowball added support for Greek language [1]. Following
> > the instructions [2], I wrote a patch that adds fulltext search for
> > Greek in Postgres. The patch is attached.
>
> I have committed a full sync from the upstream snowball repository,
> which pulled in the new greek stemmer.
>
> Could you please clarify where you got the stopword list from?  The
> README says those need to be downloaded separately, but I wasn't able to
> find the download location.  It would be good to document this, for
> example in the commit message.  I haven't committed the stopword list yet.
>

Thank you Peter,

Here is the repo with the stop-words:
https://github.com/pmav99/greek_stopwords
The list is based on an earlier publication, with modifications by me. All
the relevant info is on GitHub.

Disclaimer 1: The list has not been validated by an expert.

Disclaimer 2: There are more stop-word lists on the internet, but they are
less complete and they also include Ancient Greek words. Furthermore, my
testing showed that snowball needs to handle accents (tonous) and ς (teliko
sigma) in a special way if you want the stemmer to work with capitalized
words too.

https://github.com/Xangis/extra-stopwords/blob/master/greek
https://github.com/stopwords-iso/stopwords-el/tree/master/raw
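
To illustrate the point about accents and teliko sigma, here is a minimal
Python sketch of the kind of normalization that makes capitalized and
lowercase forms compare equal. It is an assumption for illustration only: the
function name normalize_greek is hypothetical, and this is not what snowball
or Postgres actually does internally. Note that it drops every combining
mark, so dialytika is removed along with the tonos.

```python
import unicodedata


def normalize_greek(word: str) -> str:
    """Hypothetical helper: lowercase, strip accents, fold final sigma."""
    # Decompose so accents become separate combining marks (e.g. U+0301).
    decomposed = unicodedata.normalize("NFD", word.lower())
    # Drop all combining marks (category Mn): tonos, and also dialytika.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Fold teliko sigma into the regular sigma so both forms match.
    folded = stripped.replace("ς", "σ")
    return unicodedata.normalize("NFC", folded)


# The capitalized and lowercase spellings end up identical.
print(normalize_greek("ΛΌΓΟΣ") == normalize_greek("λόγος"))  # True
```

Applying the same normalization to both the stop-word list and the input
text would keep the two consistent, whichever convention is chosen.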

all the best,
Panagiotis


Re: Feature: Add Greek language fulltext search

2019-07-04 Thread Peter Eisentraut
On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
> Last November snowball added support for Greek language [1]. Following
> the instructions [2], I wrote a patch that adds fulltext search for
> Greek in Postgres. The patch is attached. 

I have committed a full sync from the upstream snowball repository,
which pulled in the new greek stemmer.

Could you please clarify where you got the stopword list from?  The
README says those need to be downloaded separately, but I wasn't able to
find the download location.  It would be good to document this, for
example in the commit message.  I haven't committed the stopword list yet.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Feature: Add Greek language fulltext search

2019-03-25 Thread Tom Lane
Panagiotis Mavrogiorgos  writes:
> Last November snowball added support for Greek language [1]. Following the
> instructions [2], I wrote a patch that adds fulltext search for Greek in
> Postgres. The patch is attached.

Cool!

> I would appreciate any feedback that will help in getting this merged.

We're past the deadline for submitting features for v12, but please
register this patch in the first v13 commitfest so that we remember
about it when the time comes:

https://commitfest.postgresql.org/23/

regards, tom lane