Searching/sorting strategy for many properties for semantic web app

David Pratt Tue, 21 Feb 2006 19:20:36 -0800

Hi there. I am new to Lucene. I have been developing a semantic application for a while, and it appears to me that Lucene could help me get a much-needed search with reasonable speed. I have some general questions to start:

1) Since my app is virtually all metadata, what should I store in the indexes, if anything?
2) Should I only index the most common properties that people will search and combine the rest (and index this combined text as a field)?
3) I would like to sort and filter results but am concerned this could be very memory intensive.
4) Some general guidance on organizing indexes in an app would be appreciated.

My schema is fairly large, but I generally expect people to search on about 6 to 8 properties for the most part. I have the data stored in an SQL database, but not in a conventional way. I am willing to accept a slower advanced search on less common properties (accommodating this with SQL search), but I really want some speed for the main properties with full-text search.

Pretty much everything in the app is metadata, so I am most interested in focussing on the 6-8 properties that people will use to search on for the most part. I am thinking of combining the text of the remaining properties (quite a number) into a single description-type field so that essentially all information gets indexed and ranked. Is this a reasonable approach?
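To make the idea concrete, here is a minimal sketch in plain Python of how a record might be split before indexing. It does not use any Lucene API; the function name `build_index_fields`, the `description` field name, and the sample property names are all hypothetical and only illustrate the layout being asked about.

```python
def build_index_fields(record, main_properties, combined_field="description"):
    """Split a metadata record into per-property fields plus one combined field.

    All names here are illustrative assumptions, not part of Lucene.
    """
    fields = {}
    leftovers = []
    for name, value in record.items():
        if value is None or value == "":
            continue  # skip empty properties entirely
        if name in main_properties:
            # Frequently searched property: gets its own field.
            fields[name] = str(value)
        else:
            # Less common property: its text is folded into the combined field.
            leftovers.append(str(value))
    fields[combined_field] = " ".join(leftovers)
    return fields


# Hypothetical record with two "main" properties and two minor ones.
record = {
    "title": "Ontology editor",
    "creator": "D. Pratt",
    "format": "text/html",
    "rights": "public domain",
}
fields = build_index_fields(record, main_properties={"title", "creator"})
```

Each entry in `fields` would then become one indexed field of the document, with the combined field serving as the catch-all for full-text search.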

I see that there are advanced possibilities with the indexes to sort and filter. How advisable is using sort for large record sets? For example, say you have 20000 records returned from your search. Because this will have a web interface, I will likely only be showing the first 20, so I will be batching results. Is the sorting and filtering highly memory intensive?
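The batching scenario above can be sketched in plain Python as follows. This is only an illustration of the general technique of selecting one sorted page without sorting the whole result set; it makes no claim about how Lucene handles sorting internally. The function name `page_of_sorted_hits` and the sample hit structure are assumptions.

```python
import heapq


def page_of_sorted_hits(hits, sort_key, page=0, page_size=20):
    """Return one page of hits in sorted order without sorting every hit.

    Only the best (page + 1) * page_size hits are kept while scanning.
    """
    needed = (page + 1) * page_size
    best = heapq.nsmallest(needed, hits, key=sort_key)
    return best[page * page_size:]


# Hypothetical result set: 20000 hits whose titles arrive in scrambled order.
hits = [{"id": i, "title": "t%05d" % ((i * 7919) % 20000)} for i in range(20000)]

first_page = page_of_sorted_hits(hits, sort_key=lambda h: h["title"])
second_page = page_of_sorted_hits(hits, sort_key=lambda h: h["title"], page=1)
```

With this approach the memory needed for ordering grows with the page being requested rather than with the total number of matches, although the sort values for every hit still have to be available while scanning.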


Hopefully, someone can provide some initial advice. Many thanks.

Regards,
David
