Re: Arabic analyser

2015-11-11 Thread David Murgatroyd
>So BasisTech works for the latest version of solr? Yes, our latest Arabic analyzer supports up through 5.3.x. But since the examples you give are names, it sounds like you might instead/also want our fuzzy name matcher which will find "عبد الله" not only with "عبدالله" but also with typos like

Re: Arabic analyser

2015-11-11 Thread Mahmoud Almokadem
Thanks Alex, So BasisTech works for the latest version of Solr? Sincerely, Mahmoud On Tue, Nov 10, 2015 at 5:28 PM, Alexandre Rafalovitch wrote: > If this is for a significant project and you are ready to pay for it, > BasisTech has commercial solutions in this area I

Re: Arabic analyser

2015-11-11 Thread Mahmoud Almokadem
Thank you very much David, It's wonderful and I will try it. On Wed, Nov 11, 2015 at 1:37 PM, David Murgatroyd wrote: > >So BasisTech works for the latest version of solr? > > Yes, our latest Arabic analyzer supports up through 5.3.x. But since the > examples you give are

Re: Arabic analyser

2015-11-10 Thread Alexandre Rafalovitch
If this is for a significant project and you are ready to pay for it, BasisTech has commercial solutions in this area I believe. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 10 November 2015 at 08:46, Mahmoud Almokadem

Re: Arabic analyser

2015-11-10 Thread Mahmoud Almokadem
Thanks Paul, The Arabic analyser applies its normalisation and stemming filters only to single terms coming out of the standard tokenizer. Gathering all synonyms will be hard work. Should I customise my Tokenizer to handle this case? Sincerely, Mahmoud On Tue, Nov 10, 2015 at 3:06 PM, Paul Libbrecht

Re: Arabic analyser

2015-11-10 Thread Paul Libbrecht
Mahmoud, there is an Arabic analyzer: https://wiki.apache.org/solr/LanguageAnalysis#Arabic Doesn't it do what you describe? Synonyms probably work there too. Paul > Mahmoud Almokadem > 9 November 2015 17:47 > Thanks Jack, > > This is a good solution, but we

Re: Arabic analyser

2015-11-09 Thread Jack Krupansky
Use an index-time (but not query time) synonym filter with a rule like: Abd Allah,Abdallah This will index the combined word in addition to the separate words. -- Jack Krupansky On Mon, Nov 9, 2015 at 4:48 AM, Mahmoud Almokadem wrote: > Hello, > > We are indexing
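
To make the effect of the suggested rule concrete, here is a minimal Python sketch of index-time-only synonym expansion. It is not Solr code: the function names and the whitespace "analyzer" are assumptions made for illustration, and token positions are ignored. It only shows why a document indexed with the rule `Abd Allah,Abdallah` can be found by either spelling while the query itself is left unexpanded.

```python
# Illustrative sketch only; this is NOT how Solr implements its synonym filter.
# All names below are hypothetical helpers, and token positions are ignored.

# One group per synonym rule, mirroring the rule "Abd Allah,Abdallah".
SYNONYM_GROUPS = [["abd allah", "abdallah"]]


def analyze(text):
    """Stand-in analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def contains_phrase(tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def index_tokens(text):
    """Index-time analysis: original tokens plus every variant in a matching group."""
    tokens = analyze(text)
    expanded = set(tokens)
    for group in SYNONYM_GROUPS:
        phrases = [phrase.split() for phrase in group]
        if any(contains_phrase(tokens, p) for p in phrases):
            for p in phrases:
                expanded.update(p)
    return expanded


def matches(query, document):
    """Query-time analysis applies no synonyms; every query token must be indexed."""
    return set(analyze(query)) <= index_tokens(document)


if __name__ == "__main__":
    print(matches("Abdallah", "Abd Allah Ahmed"))
    print(matches("Abd Allah", "Abdallah Ahmed"))
```

Because the expansion happens once when the document is indexed, the query side stays simple, which is the reason for applying the filter at index time but not at query time.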

Re: Arabic analyser

2015-11-09 Thread Mahmoud Almokadem
Thanks Jack, This is a good solution, but we have more combinations that I think can’t be handled as synonyms, such as every word that starts with ‘عبد’ ‘Abd’ or ‘أبو’ ‘Abo’. When using the Standard tokenizer on ‘أبو بكر’ ‘Abo Bakr’, it’ll be tokenised to ‘أبو’ and ‘بكر’ and the filters will be applied
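
The open-ended case described here (any name beginning with ‘عبد’ or ‘أبو’) could be handled by a rule that joins such a prefix with the word that follows it, rather than by listing every pair as a synonym. The Python sketch below only illustrates that idea under stated assumptions: `NAME_PREFIXES` and `tokens_with_joined_names` are hypothetical names, tokenisation is plain whitespace splitting, and none of this is taken from Solr or from the thread. In Solr the same logic would have to live in a custom analysis component.

```python
# Illustrative sketch only; hypothetical names, not part of Solr or Lucene.

# Prefix words taken from the message; a real list would likely be longer.
NAME_PREFIXES = {"عبد", "أبو"}


def tokens_with_joined_names(text):
    """Split on whitespace and, after each prefix word, also emit the
    prefix joined to the following word (for example 'أبو' + 'بكر')."""
    words = text.split()
    tokens = []
    for i, word in enumerate(words):
        tokens.append(word)
        if word in NAME_PREFIXES and i + 1 < len(words):
            # Keep the separate tokens and add the joined spelling as well.
            tokens.append(word + words[i + 1])
    return tokens


if __name__ == "__main__":
    print(tokens_with_joined_names("أبو بكر"))
    print(tokens_with_joined_names("عبد الله"))
```

Emitting the joined form alongside the separate tokens keeps both spellings searchable, in the same spirit as the index-time synonym rule, without enumerating every combination by hand.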