On Tue, Dec 16, 2008 at 2:09 AM, ayyanar <[email protected]> wrote:
> Hi, Kindly share your thoughts on Why lucene and why not SQL?

Possible scenario: You have 200,000 text documents to search, and you need to find all documents that contain the words "baseball" and "pitchers".

In SQL you would write where (text like '%baseball%' and text like '%pitchers%'). That query can take a very long time, because a LIKE pattern with a leading wildcard cannot use an ordinary SQL index, so the database has to scan the text of every row.

Lucene can find the documents that mention those words very quickly, because its index is built from the individual words found in the documents.

In Lucene you can also say "baseball pitchers"~5 to find just those documents where the two words are close together (roughly within 5 words of each other). Standard SQL has no proximity search; at best you depend on whatever vendor-specific full-text extensions your database offers.

The performance gap becomes more apparent as the document set grows. SQL can search a small number of documents fairly well, but it gets much slower with very many documents, while Lucene stays fast.

SQL is fairly useful for short text fields with limited contents that can be indexed. Lucene is good for bigger full texts and very many documents.

Jenny Brown
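
P.S. To make the index difference concrete, here is a minimal sketch in Python of a word-based (inverted) index with word positions. It is not Lucene code and not how Lucene is implemented internally; the function names (tokenize, build_index, search_all, search_near) and the sample documents are made up for illustration, and the proximity check is a simplification of Lucene's "~5" slop rule.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def search_all(index, words):
    """Return ids of documents containing every word (an AND query)."""
    postings = [set(index.get(w, {})) for w in words]
    return set.intersection(*postings) if postings else set()


def search_near(index, first, second, max_gap):
    """Return ids of documents where both words occur within max_gap positions.

    Simplified stand-in for a proximity query; Lucene's slop is computed
    differently, so treat this only as an illustration of the idea.
    """
    hits = set()
    for doc_id in search_all(index, [first, second]):
        if any(abs(p - q) <= max_gap
               for p in index[first][doc_id]
               for q in index[second][doc_id]):
            hits.add(doc_id)
    return hits


# Hypothetical sample documents, invented for this example.
docs = {
    1: "Baseball pitchers throw fastballs.",
    2: "Baseball is popular. Many fans travel long distances every "
       "summer just to watch their favorite pitchers.",
    3: "Football season starts soon.",
}
index = build_index(docs)

print(sorted(search_all(index, ["baseball", "pitchers"])))   # [1, 2]
print(sorted(search_near(index, "baseball", "pitchers", 5)))  # [1]
```

The point of the sketch is that each lookup touches only the entries for the words in the query, never the full text of every document, which is why this kind of search does not slow down the way a '%word%' scan does. Storing positions alongside each word is what makes the "close together" check possible.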
