On 18 Sep 2007, at 23:23, Liaqat Ali wrote:

I'm new to the field of Information Retrieval and am now working on a search engine for languages like Arabic and Urdu. Kindly advise me on how Lucene can be used for this purpose.

Lucene makes no distinction between languages. All data is discrete chunks of characters, known as tokens. Tokens are stored in fields, and the combination of a token and a specific field is known as a term. Which tokens your index ends up containing depends on the analyzer you use. An analyzer can be language-sensitive, but it can also do something else entirely.
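To make the token/field/term idea concrete, here is a minimal sketch in plain Python. It is not Lucene's API; the function names (whitespace_analyzer, normalizing_analyzer, build_index) and the sample documents are made up for illustration. It only shows how the choice of analyzer decides which terms end up in an index, using an Arabic word written with and without vowel marks.

```python
import re
import unicodedata
from collections import defaultdict

def whitespace_analyzer(text):
    """Language-neutral analyzer: split on whitespace, change nothing else."""
    return text.split()

def normalizing_analyzer(text):
    """Hypothetical language-aware analyzer: decompose the text, drop
    combining marks (this removes Arabic vowel marks), then split on
    non-word characters."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.findall(r"\w+", stripped)

def build_index(docs, analyzer):
    """Map each term, i.e. a (field, token) pair, to the ids of the
    documents that contain it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for token in analyzer(text):
                index[(field, token)].add(doc_id)
    return index

# The same word ("kitab", book) with and without vowel marks.
KITAB_VOCALIZED = "\u0643\u0650\u062a\u064e\u0627\u0628"
KITAB_PLAIN = "\u0643\u062a\u0627\u0628"

docs = {
    1: {"title": KITAB_VOCALIZED},
    2: {"title": KITAB_PLAIN},
}

plain_index = build_index(docs, whitespace_analyzer)
normalized_index = build_index(docs, normalizing_analyzer)

# With the whitespace analyzer the two spellings are different terms;
# with the normalizing analyzer they collapse into one term.
print(sorted(plain_index[("title", KITAB_PLAIN)]))
print(sorted(normalized_index[("title", KITAB_PLAIN)]))
```

In this sketch a query for the plain spelling in the title field finds only document 2 under the whitespace analyzer, but both documents under the normalizing one. The same analyzer has to be applied to queries as to indexed text, otherwise the terms will not match.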

Can anybody tell me exactly what I should do to design a search engine from scratch using Lucene?

You need to define what your search engine is supposed to do in order to get an answer that makes sense.


Lucene in Action is a pretty good book, even though it covers version 1.4 or so. The SVN repository contains a demo application. There are also the wiki and this forum.

--
karl
