Hi, Moving discussion to Lucene user list.
You may want to look at these references:

* http://lucene.472066.n3.nabble.com/JLemmaGen-project-td4097466.html
* https://github.com/Amice13/ukr_stemmer

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 18 Feb 2016, at 17:58, Jack Krupansky <[email protected]> wrote:
>
> Oops... just noticed - this discussion should be moved to the user mailing
> list, not the "general" list. Sorry for not noticing earlier.
>
> -- Jack Krupansky
>
> On Thu, Feb 18, 2016 at 11:57 AM, Jack Krupansky <[email protected]>
> wrote:
>
>> Wow, Ukraine language sure is a challenge due to all of the political and
>> cultural forces over the centuries. See:
>> https://en.wikipedia.org/wiki/Ukrainian_language
>>
>> So, the first question is what is the central focus of an interest in the
>> Ukraine language - more focused on contemporary media (newspapers,
>> magazines, government documents), literature, and social media (blog posts,
>> tweets) in Kiev, or more on historic literature/books and official
>> documents in the 20th Century? Or... what?
>>
>> Which dialect(s) are your central focus? e.g., Middle Dnieprian ("the
>> basis of the Standard Literary Ukrainian")?
>>
>> Any examples to give for technical issues such as stemming, punctuation,
>> word boundaries, compound words, stop words? Which modern language is
>> Ukrainian most similar to... Russian? How similar, how dissimilar?
>>
>>
>> -- Jack Krupansky
>>
>> On Thu, Feb 18, 2016 at 11:41 AM, Upayavira <[email protected]> wrote:
>>
>>> Nurul,
>>>
>>> You can search through JIRA [1] for Lucene issues regarding Ukrainian. I
>>> didn't find anything to suggest anyone is working on it.
>>>
>>> What do you need Lucene to do that it currently doesn't? You may well be
>>> able to get away with using another language, or a more generic,
>>> non-language specific analysis for such languages.
>>>
>>> As to who to pay - there's no specific set of people - anyone who both
>>> understands Lucene's internals, and understands (or can be helped to
>>> understand) the needs of the Ukrainian language should be able to do the
>>> work.
>>>
>>> Upayavira
>>>
>>> On Thu, Feb 18, 2016, at 03:55 PM, Nurul AMIN wrote:
>>>> Hello Upayavira,
>>>>
>>>> Thanks for your email.
>>>>
>>>> In that case, can I know, if Lucene team is already working on
>>>> "Ukrainian". If I need to pay, do you know how much is the cost and whom
>>>> should I contact?
>>>>
>>>> Many thanks!
>>>>
>>>> Best regards,
>>>>
>>>> Nurul Amin
>>>> Manager, Software Development, Service Technology Group (STG)
>>>> Amadeus Customer Service (ACS)
>>>> Amadeus s.a.s.
>>>> France
>>>> T: +33 4 97 23 03 82
>>>>
>>>> Done is better than perfect!
>>>>
>>>> -----Original Message-----
>>>> From: Upayavira [mailto:[email protected]]
>>>> Sent: 18 February 2016 10:41
>>>> To: [email protected]
>>>> Subject: Re: Lucene roadmap for language analyzers
>>>>
>>>> Nurul,
>>>>
>>>> Given the community based, meritocratic nature of the Lucene community,
>>>> there is no 'roadmap' as such. Features are added when people need them
>>>> and can justify developing them.
>>>>
>>>> The features you are requesting, if not present already, will be added
>>>> when someone needs them sufficiently to implement them, or to pay
>>>> someone to implement them.
>>>>
>>>> Upayavira
>>>>
>>>>
>>>> On Wed, Feb 17, 2016, at 11:05 PM, Nurul AMIN wrote:
>>>>> Hello,
>>>>>>
>>>>>> I do not find Lucene roadmap for language implementation. In fact, I
>>>>>> am interested on the following languages
>>>>>> -Ukrainian
>>>>>> -Hebrew
>>>>>> -Bahasa.
>>>>>>
>>>>>> Seems Lucene does not have those languages today
>>>>>> (https://lucene.apache.org/core/5_4_1/analyzers-common/overview-summary.html)
>>>>>>
>>>>>> Do you know, if future versions of Lucene will bring those languages?
>>>>>>
>>>>>> Many thanks in advance for your help.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Nurul Amin
