[jira] [Commented] (LUCENE-6875) New Serbian Filter

Nikola Smolenski (JIRA) Tue, 03 Nov 2015 04:02:55 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987157#comment-14987157
 ]


Nikola Smolenski commented on LUCENE-6875:
------------------------------------------

This conversion is so ubiquitous that I can't find a reference for it. The 
official orthography of Serbian lists the two alphabets, but doesn't explicitly 
specify how to convert between them. Various other software projects use the 
same conversion, for example GNU gettext 
http://cvs.savannah.gnu.org/viewvc/gettext/gettext-tools/src/filter-sr-latin.c?revision=1.4&root=gettext&view=markup
 or MediaWiki 
https://phabricator.wikimedia.org/diffusion/MW/browse/master/languages/classes/LanguageSr.php
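
For illustration, the Cyrillic-to-Latin direction of this conversion can be 
sketched as a plain character table. This is a minimal Python sketch, not the 
code from the attached patch or from the projects linked above; the names 
CYR_TO_LAT and serbian_cyrillic_to_latin are hypothetical, and the input is 
lowercased first on the assumption that case does not matter for search.

```python
# Hypothetical illustration: the 30 letters of the Serbian Cyrillic alphabet
# and their usual Latin counterparts (lowercase only).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def serbian_cyrillic_to_latin(text: str) -> str:
    """Lowercase the text and map each Serbian Cyrillic letter to Latin.

    Characters that are not in the table (Latin letters, digits,
    punctuation) are passed through unchanged.
    """
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text.lower())
```

Three Cyrillic letters (љ, њ, џ) become two-letter digraphs, so the reverse 
direction is not a simple per-character lookup: a Latin "nj" may stand for њ 
or for н followed by ј.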

I have never seen ISO 9 used in practice, and it wouldn't be useful here 
anyway, since no one would enter the queries in ISO 9.

> New Serbian Filter
> ------------------
>
>                 Key: LUCENE-6875
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6875
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Nikola Smolenski
>            Priority: Minor
>         Attachments: Lucene-Serbian-Regular.patch
>
>
> This is a new Serbian filter that works with regular Latin text (the current 
> filter works with "bald" Latin). I described in detail what it does and 
> why it is necessary on the wiki.



