Re: [HACKERS] new full text search configurations

2015-11-21 Thread Emre Hasegeli
> I checked the new Snowball site http://snowballstem.org/ and found that
> several new stemmers have appeared (as external contributions):
>
> Irish and Czech
> Object Pascal codegenerator for Snowball
> Two stemmers for Romanian
> Hungarian
> Turkish
> Armenian
> Basque (Euskera)
> Catalan
>
> Some of them we don't have in our list of default configurations. Since
> these are external, not official, stemmers, it'd be nice if people looked at
> and tested them. If they are fine, we can prepare new configurations for 9.6.

We already have configurations for the ones included in Snowball itself,
namely Romanian, Hungarian, and Turkish.  I don't know why the others are
listed on the page as external contributions but not included.  It might
be a good idea to wait for someone to commit them upstream.

I have checked the changes to the algorithms [1].  They don't seem to
have been updated much since 2007, but a new one for the Tamil language
was added recently.  It might be a good candidate for a new
configuration.

[1] https://github.com/snowballstem/snowball/commits/master/algorithms




[HACKERS] new full text search configurations

2015-11-17 Thread Oleg Bartunov
I checked the new Snowball site http://snowballstem.org/ and found that
several new stemmers have appeared (as external contributions):

   - Irish and Czech
   - Object Pascal codegenerator for Snowball
   - Two stemmers for Romanian
   - Hungarian
   - Turkish
   - Armenian
   - Basque (Euskera)
   - Catalan

Some of them we don't have in our list of default configurations. Since
these are external, not official, stemmers, it'd be nice if people looked at
and tested them. If they are fine, we can prepare new configurations for 9.6.

 \dF
               List of text search configurations
   Schema   |    Name    |              Description
------------+------------+---------------------------------------
 pg_catalog | danish     | configuration for danish language
 pg_catalog | dutch      | configuration for dutch language
 pg_catalog | english    | configuration for english language
 pg_catalog | finnish    | configuration for finnish language
 pg_catalog | french     | configuration for french language
 pg_catalog | german     | configuration for german language
 pg_catalog | hungarian  | configuration for hungarian language
 pg_catalog | italian    | configuration for italian language
 pg_catalog | norwegian  | configuration for norwegian language
 pg_catalog | portuguese | configuration for portuguese language
 pg_catalog | romanian   | configuration for romanian language
 pg_catalog | russian    | configuration for russian language
 pg_catalog | simple     | simple configuration
 pg_catalog | spanish    | configuration for spanish language
 pg_catalog | swedish    | configuration for swedish language
 pg_catalog | turkish    | configuration for turkish language
 public     | english_ns |
(17 rows)
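
For anyone who wants to try a stemmer before a default configuration
exists, something like the following might serve as a starting point. It
is only a sketch: the names my_turkish_stem and my_turkish are made up
for illustration, and it uses Turkish because that stemmer is already
built into the server. A stemmer that is not compiled into the server's
Snowball library (for example one of the external contributions) cannot
be selected this way until it has been added to the build.

```sql
-- Call an existing Snowball dictionary directly on a single word.
SELECT ts_lexize('english_stem', 'stars');

-- Hypothetical dictionary name; LANGUAGE must be a stemmer that the
-- server's Snowball library was built with.
CREATE TEXT SEARCH DICTIONARY my_turkish_stem (
    TEMPLATE = snowball,
    LANGUAGE = turkish
);

-- Hypothetical configuration name; start from "simple" and route the
-- word-like token types through the new dictionary.
CREATE TEXT SEARCH CONFIGURATION my_turkish (COPY = simple);

ALTER TEXT SEARCH CONFIGURATION my_turkish
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH my_turkish_stem;

-- Inspect what the configuration does with sample input.
SELECT to_tsvector('my_turkish', 'kitaplar');
SELECT * FROM ts_debug('my_turkish', 'kitaplar');
```

Running ts_debug over a list of real words in the language is probably
the quickest way to see where a stemmer goes wrong.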


Re: [HACKERS] new full text search configurations

2015-11-17 Thread Pavel Stehule
Hi

2015-11-17 17:28 GMT+01:00 Oleg Bartunov:

> I checked the new Snowball site http://snowballstem.org/ and found that
> several new stemmers have appeared (as external contributions):
>
>
>- Irish and Czech 
>

The Czech Snowball stemmer needs to be rechecked. Five years ago it was not
successful in my tests.

Regards

Pavel


