Re: lucene for Arabic and Urdu

Karl Wettin Tue, 18 Sep 2007 14:45:35 -0700


On 18 Sep 2007, at 23:23, Liaqat Ali wrote:

I'm new to the field of Information Retrieval and am now working to develop a search engine for languages like Arabic and Urdu. Kindly guide me on how Lucene can be utilized for this purpose.

Lucene makes no distinction between languages. All data is discrete chunks of characters, also known as tokens. Tokens are represented in fields, and the combination of a token and a specific field is known as a term. What tokens your index ends up containing depends on the analyzer strategy you use. An analyzer could be language sensitive; it could also be something completely different.
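
To make the token/field/term idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: the class TermSketch and its methods are hypothetical names made up for illustration, and the two "strategies" are only stand-ins for what a real analyzer might do. It assumes that stripping combining marks (such as Arabic short-vowel diacritics) is a normalization you want, which depends entirely on your application.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: NOT Lucene classes, just the concepts of
// token, field, and term expressed with the Java standard library.
public class TermSketch {

    // Hypothetical strategy 1: split on whitespace, keep text as-is.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical strategy 2: remove non-spacing marks (Unicode
    // category Mn, which includes Arabic diacritics), then split.
    static List<String> markStrippingTokens(String text) {
        return whitespaceTokens(text.replaceAll("\\p{Mn}+", ""));
    }

    // A term is a token paired with the field it occurs in.
    static List<String> terms(String field, List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            result.add(field + ":" + token);
        }
        return result;
    }

    public static void main(String[] args) {
        // "kitab jadid" ("new book"); the first word carries diacritics.
        String text = "\u0643\u0650\u062A\u064E\u0627\u0628 \u062C\u062F\u064A\u062F";

        // Same input, different strategies, different terms in the index.
        System.out.println(terms("title", whitespaceTokens(text)));
        System.out.println(terms("title", markStrippingTokens(text)));
    }
}
```

With the first strategy, a query for the undecorated word would not match the decorated one, because the two are different tokens. With the second they become the same token. Whichever processing you choose has to be applied to both the indexed text and the query.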

Can anybody tell me exactly what I should do to design a search engine from scratch using Lucene?

You need to define what your search engine is supposed to do in order to get an answer that makes sense.

Lucene in Action is a pretty good book, even though it covers 1.4 or so. The SVN repository contains a demo application. There is also the Wiki and this forum.


--
karl
