Re: [HACKERS] new full text search configurations

2015-11-21 Thread Emre Hasegeli
> I checked new snowball site http://snowballstem.org/ and found several new
> stemmers appeared (as external contributions):
>
> Irish and Czech
> Object Pascal codegenerator for Snowball
> Two stemmers for Romanian
> Hungarian
> Turkish
> Armenian
> Basque (Euskera)
> Catalan
>
> Some of them we don't have in our list of default configurations. Since
> these are external, not official, stemmers, it'd be nice if people looked
> at them and tested them. If they are fine, we can prepare new
> configurations for 9.6.

We already have configurations for the ones included in Snowball itself,
namely Romanian, Hungarian, and Turkish.  I don't know why the others are
listed on the page as external contributions rather than being included.
It might be a good idea to wait for someone to commit them upstream.
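
For anyone who wants to see what the existing stemmers do before comparing
them with the external ones, a quick check from psql might look like the
sketch below.  The sample words are arbitrary and chosen only for
illustration; turkish_stem is the Snowball dictionary behind the built-in
turkish configuration.

```sql
-- Run a single word through the built-in Turkish Snowball dictionary.
-- The input word is an arbitrary example, not taken from this thread.
SELECT ts_lexize('turkish_stem', 'kitaplar');

-- Run a short phrase through the whole configuration (parser plus
-- dictionaries) to see which lexemes end up in the tsvector.
SELECT to_tsvector('turkish', 'kitaplar ve defterler');
```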

I have checked the changes to the algorithms [1].  They don't seem
to have been updated much since 2007, but a new one for the Tamil
language was added recently.  It might be a good candidate for a new
configuration.
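
As a rough sketch of what such a configuration could look like, the
statements below follow the pattern used for the existing Snowball
configurations.  This is hypothetical: it assumes a server whose bundled
Snowball library has been built with a Tamil stemmer, which is not the case
today, and the names tamil_stem and tamil are only placeholders.

```sql
-- Hypothetical: LANGUAGE = 'tamil' only works if the server's Snowball
-- library includes a Tamil stemmer; otherwise this statement fails.
CREATE TEXT SEARCH DICTIONARY tamil_stem (
    TEMPLATE = snowball,
    LANGUAGE = 'tamil'
);

-- Start from the default parser with no token mappings.
CREATE TEXT SEARCH CONFIGURATION tamil (PARSER = default);

-- Leave numbers, URLs, and similar tokens unstemmed.
ALTER TEXT SEARCH CONFIGURATION tamil ADD MAPPING
    FOR email, url, url_path, host, file, version,
        sfloat, float, int, uint,
        numword, hword_numpart, numhword
    WITH simple;

-- Send every word-like token type to the (assumed) Tamil stemmer.
ALTER TEXT SEARCH CONFIGURATION tamil ADD MAPPING
    FOR asciiword, hword_asciipart, asciihword,
        word, hword_part, hword
    WITH tamil_stem;
```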

[1] https://github.com/snowballstem/snowball/commits/master/algorithms




Re: [HACKERS] new full text search configurations

2015-11-17 Thread Pavel Stehule
Hi

2015-11-17 17:28 GMT+01:00 Oleg Bartunov:

> I checked new snowball site http://snowballstem.org/ and found several
> new stemmers appeared (as external contributions):
>
> - Irish and Czech
>

The Czech Snowball stemmer needs to be rechecked. Five years ago it did not
perform well in my tests.

Regards

Pavel



>
> - Object Pascal codegenerator for Snowball
> - Two stemmers for Romanian
> - Hungarian
> - Turkish
> - Armenian
> - Basque (Euskera)
> - Catalan
>
> Some of them we don't have in our list of default configurations. Since
> these are external, not official, stemmers, it'd be nice if people looked
> at them and tested them. If they are fine, we can prepare new
> configurations for 9.6.
>
>  \dF
>                List of text search configurations
>    Schema   |    Name    |              Description
> ------------+------------+---------------------------------------
>  pg_catalog | danish     | configuration for danish language
>  pg_catalog | dutch      | configuration for dutch language
>  pg_catalog | english    | configuration for english language
>  pg_catalog | finnish    | configuration for finnish language
>  pg_catalog | french     | configuration for french language
>  pg_catalog | german     | configuration for german language
>  pg_catalog | hungarian  | configuration for hungarian language
>  pg_catalog | italian    | configuration for italian language
>  pg_catalog | norwegian  | configuration for norwegian language
>  pg_catalog | portuguese | configuration for portuguese language
>  pg_catalog | romanian   | configuration for romanian language
>  pg_catalog | russian    | configuration for russian language
>  pg_catalog | simple     | simple configuration
>  pg_catalog | spanish    | configuration for spanish language
>  pg_catalog | swedish    | configuration for swedish language
>  pg_catalog | turkish    | configuration for turkish language
>  public     | english_ns |
> (17 rows)
>