Lucene ancient greek normalization

2014-11-21 Thread paolo anghileri
For development purposes I need Lucene to normalize ancient Greek characters in all cases of grammatical detail, such as accents, diacritics and so on. I need to retrieve ancient Greek words with accents and other grammatical details by the input of the string without

Re: Lucene ancient greek normalization

2014-11-21 Thread paolo anghileri
Sorry, I forgot to add the link to the Lucene file: https://github.com/apache/lucene-solr/blob/trunk/lucene/analysis/common/src/java/org/apache/lucene/analysis/el/GreekLowerCaseFilter.java
On 21/11/2014 20:14, paolo anghileri wrote: For development purposes I need the ability in lucene to normalize

Re: Lucene ancient greek normalization

2014-11-21 Thread Alexandre Rafalovitch
Are you sure that's not something that's already addressed by the ICU Filter? http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/icu/ICUTransformFilterFactory.html If you follow the links to what's possible, the page talks about Greek, though not ancient:

RE: Lucene ancient greek normalization

2014-11-21 Thread Allison, Timothy B.
ICU looks promising:
Μῆνιν ἄειδε, θεὰ, Πηληϊάδεω Ἀχιλλῆος
1. μηνιν 2. αειδε 3. θεα 4. πηληιαδεω 5. αχιλληοσ
-Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, November 21, 2014 3:08 PM To: dev@lucene.apache.org Subject: Re: Lucene ancient greek
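To make the folding above concrete, here is a minimal sketch that uses only the JDK, not Lucene or ICU. It approximates the result shown in this message by decomposing to NFD, dropping combining marks, lowercasing, and mapping final sigma to plain sigma. The class name `GreekFoldSketch` and its methods are hypothetical, and this is not how the ICU filter is implemented; it only illustrates the kind of normalization being discussed.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene or ICU.
public class GreekFoldSketch {

    // Fold polytonic Greek to a bare lowercase form.
    public static String fold(String input) {
        // NFD splits each precomposed letter into base letter + combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop all combining marks (accents, breathings, diaeresis, iota subscript).
        String stripped = decomposed.replaceAll("\\p{M}", "");
        // Lowercase, then map final sigma to plain sigma so both forms match.
        return stripped.toLowerCase(Locale.ROOT).replace('ς', 'σ');
    }

    // Split folded text into word tokens on runs of non-letter characters.
    public static List<String> tokens(String input) {
        return Arrays.asList(fold(input).split("[^\\p{L}]+"));
    }

    public static void main(String[] args) {
        System.out.println(tokens("Μῆνιν ἄειδε, θεὰ, Πηληϊάδεω Ἀχιλλῆος"));
    }
}
```

Applying `fold` to both the indexed text and the user's query would let an unaccented query match accented words, which is the behaviour asked for at the start of the thread. In a real Lucene or Solr setup the ICU filters mentioned above are the more likely choice.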

Re: Lucene ancient greek normalization

2014-11-21 Thread paolo anghileri
Many thanks Alex. For clarity, let me explain a bit what I would like to do: I'd like to use MediaWiki as a base for this project. The need is to be able to search with simple strings without grammatical details and retrieve data with grammatical details. For that, I am evaluating to use a

Re: Lucene ancient greek normalization

2014-11-21 Thread Alexandre Rafalovitch
On 21 November 2014 16:10, paolo anghileri paolo.anghil...@codegeneration.it wrote: The need is being able to search with simple strings without grammatical details and retrieve data with grammatical details.
I am pretty sure that this is what I did for a Thai demo. Actually, I went another two