Daniel's advice applies if you want to build your own application using the underlying Lucene-Java API.
You may also be interested in taking a look at the Solr and Nutch subprojects. Nutch is a more traditional search engine type system that crawls and indexes documents, while Solr supports a robust mechanism for defining your own custom schema and pushing bulk data into it. Both can easily support a variety of languages: the "hard work" of getting good behavior with a particular language is delegated to what Lucene calls an "Analyzer".

Once you have a better understanding of the various Lucene projects, building an Arabic analyzer might be the best place to focus your efforts, as you could then plug it in to either Nutch or Solr (or use it in a custom application built with Lucene-Java).

: > we are a group of undergraduate Computer Sciences and Information
: > Systems students and we wanted to develop a search engine that supports
: > the Arabic Language, but our supervisor suggested the "Lucene" instead
: > of building the whole search engine from scratch and we actually liked
: > the idea so here we are, the problem is I don't know how to start
: See
: http://wiki.apache.org/lucene-java/LuceneFAQ#head-fced767dd893d8828529074a26f99e0df7fe12ca

-Hoss
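
To give a rough idea of the kind of language-specific work an analyzer takes on, here is a minimal sketch in plain Java. It does not use the Lucene API at all: the class name ArabicAnalysisSketch and the methods normalize and tokenize are hypothetical, and the three normalizations shown (dropping diacritics, dropping tatweel, folding hamza-carrying alef forms to a bare alef) are just a few common choices, not a complete or authoritative treatment of Arabic. In a real Lucene analyzer this sort of logic would sit in a tokenizer and token filter chain.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain Java, no Lucene classes. All names are hypothetical.
class ArabicAnalysisSketch {

    // Apply a few common orthographic normalizations so that spelling
    // variants of the same word end up as the same indexed term.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '\u064B' && c <= '\u0652') {
                continue; // drop diacritics (U+064B through U+0652)
            }
            if (c == '\u0640') {
                continue; // drop tatweel, the stretching character (U+0640)
            }
            if (c == '\u0622' || c == '\u0623' || c == '\u0625') {
                c = '\u0627'; // fold alef with madda or hamza to bare alef
            }
            out.append(c);
        }
        return out.toString();
    }

    // Normalize first (diacritics are not letters and would otherwise
    // split words), then break the text on anything that is not a
    // letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String normalized = normalize(text);
        for (int i = 0; i < normalized.length(); i++) {
            char c = normalized.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A vocalized name followed by a word written with a tatweel.
        String sample = "\u0623\u064E\u062D\u0652\u0645\u064E\u062F, "
                + "\u0643\u062A\u0640\u0627\u0628";
        for (String token : tokenize(sample)) {
            System.out.println(token);
        }
    }
}
```

The point of the sketch is that the same normalization has to run on both the indexed documents and the incoming queries, which is why packaging it as an analyzer lets Nutch, Solr, or a custom Lucene-Java application all reuse it.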
