Hi All,

 

I have an index of about 1.3 GB in size which contains the metadata plus
full text of some 18,000 XML documents, which themselves were derived
from articles stored as PDFs.

 

I have built my index to handle case-insensitive and accent-insensitive
searches, which has bloated its size.  My client expects the search to
work like Google's (a snippet of the text matching the search term
alongside the location of each matching document).
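
For what it's worth, one common way to get accent-insensitivity on .NET 2.0 is to fold diacritics at both index and query time using Unicode decomposition. This is only a sketch (the `AccentFolder`/`Fold` names are hypothetical, and wiring it into a Lucene TokenFilter is omitted), but it shows the core folding step:

```csharp
using System.Globalization;
using System.Text;

static class AccentFolder
{
    // Strips combining diacritical marks: "résumé" -> "resume".
    // Normalize(FormD) decomposes each accented letter into a base
    // letter plus combining marks; we then drop the NonSpacingMark
    // code points and recompose.
    public static string Fold(string text)
    {
        string decomposed = text.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c)
                != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Folding at analysis time (rather than indexing both accented and unaccented variants) keeps the index smaller, which may help with the bloat you mention.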

 

My problem arises when I carry out a search for a term that finds more
than 2,000 matches.  I have used an implementation of HitCollector to
carry out the initial search, returning fields that are small and that I
can use as primary keys.  I then re-query the index, using the internal
ID of each document, for the records that are going to be displayed on
the web page.  This has not solved my retrieval speed problems.  In
fact, memory consumption is still huge and returning the results for
display is still slow.
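
In case it helps to compare notes, here is roughly the pattern I understand you to be describing, as a sketch (hypothetical class/method names; assumes the Lucene.Net 1.9 `HitCollector.Collect(int, float)` and `Searcher.Search(Query, HitCollector)` API). The collector keeps only an int and a float per hit, and stored fields are loaded via `Searcher.Doc()` only for the docs on the current page:

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Documents;
using Lucene.Net.Search;

class IdScoreCollector : HitCollector
{
    public readonly List<KeyValuePair<int, float>> Hits =
        new List<KeyValuePair<int, float>>();

    // Called once per matching document; only the internal doc id
    // and score are kept -- no stored fields are touched here.
    public override void Collect(int doc, float score)
    {
        Hits.Add(new KeyValuePair<int, float>(doc, score));
    }
}

class PagedSearch
{
    // Returns only the documents needed for one result page.
    public static List<Document> Search(Searcher searcher, Query query,
                                        int page, int pageSize)
    {
        IdScoreCollector collector = new IdScoreCollector();
        searcher.Search(query, collector);

        // Sort by descending score, then slice out the current page.
        collector.Hits.Sort(delegate(KeyValuePair<int, float> a,
                                     KeyValuePair<int, float> b)
        {
            return b.Value.CompareTo(a.Value);
        });

        List<Document> results = new List<Document>();
        int start = page * pageSize;
        int end = Math.Min(start + pageSize, collector.Hits.Count);
        for (int i = start; i < end; i++)
        {
            // Stored fields are fetched for this page's docs only.
            results.Add(searcher.Doc(collector.Hits[i].Key));
        }
        return results;
    }
}
```

If your version differs from this, the usual suspects are loading stored fields (or running the highlighter) inside `Collect()`, or re-running the query once per displayed record instead of calling `Searcher.Doc()` with the internal id.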

 

Has anyone else experienced this, and how did you go about resolving
it?

 

My environment/tools are Lucene.Net 1.9 final007,
Highlighter.Net-2.0-001-07Jan07 and ASP.NET 2.0 in C#.

 

Regards,

 

Derrick

 

Derrick Okundaye - .NET Developer

Thomas Telford Limited

 

