Hi there. I am new to Lucene. I have been developing a semantic
application for a while, and it appears that Lucene could give me a
much-needed search with reasonable speed. I have some general questions
to start:
1) Since my app is virtually all metadata, what should I store in the
indexes if anything?
2) Should I only index the most common properties that people will
search and combine the rest (and index this combined text as a field)?
3) I would like to sort and filter results but am concerned this could
be very memory intensive.
4) Some general guidance on organizing indexes in an app would be
appreciated.
My schema is fairly large, but I expect people to search mostly on
about 6 to 8 properties. I have the data stored in an SQL database, but
not in a conventional way. I am willing to accept a slower advanced
search on less common properties (accommodating this with SQL search),
but I really want some speed for the main properties with full text
search.
Pretty much everything in the app is metadata, so I am most interested
in focusing on the 6-8 properties that people will mostly search on. I
am thinking of combining the text of the remaining properties (quite a
number) into a single description-type field so that essentially all
information gets indexed and ranked. Is this a reasonable approach?
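To make the idea concrete, here is a rough sketch in plain Java (no
Lucene calls) of how I imagine splitting one record's properties before
indexing. The class name FieldPlanner, the field name "description" and
the example property names are all made up for illustration, and I am
assuming no real property is itself called "description".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: splits one record's properties into dedicated
// fields (the main searchable properties) plus one catch-all field.
class FieldPlanner {
    // Assumed name for the combined field; not a Lucene convention.
    static final String CATCH_ALL = "description";

    static Map<String, String> plan(Map<String, String> properties,
                                    Set<String> mainProperties) {
        Map<String, String> fields = new LinkedHashMap<>();
        List<String> rest = new ArrayList<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            String value = e.getValue();
            // Skip properties that carry no text at all.
            if (value == null || value.trim().isEmpty()) {
                continue;
            }
            if (mainProperties.contains(e.getKey())) {
                // Main property: gets its own field.
                fields.put(e.getKey(), value);
            } else {
                // Less common property: text goes into the catch-all.
                rest.add(value.trim());
            }
        }
        if (!rest.isEmpty()) {
            fields.put(CATCH_ALL, String.join(" ", rest));
        }
        return fields;
    }
}
```

Each entry of the returned map would then become one indexed field of
the document, so the 6-8 main properties stay individually searchable
and everything else is still found through the combined field.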
I see that there are advanced possibilities for sorting and filtering
with the indexes. How advisable is sorting for large record sets? For
example, say a search returns 20000 records. Because this will have a
web interface, I will likely show only the first 20, so I will be
batching results. Is sorting and filtering highly memory intensive?
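To show what I am hoping for, here is a minimal sketch in plain Java of
paging through sorted hits while holding only offset + pageSize of them
in memory, rather than all 20000. The class TopPage and its method are
hypothetical, and I do not know whether Lucene works this way
internally; that is really what I am asking.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: returns one page of hits in sorted order while
// retaining at most offset + pageSize hits at any time.
class TopPage {
    static <T> List<T> page(Iterable<T> hits, Comparator<T> order,
                            int offset, int pageSize) {
        if (offset < 0 || pageSize < 0) {
            throw new IllegalArgumentException("offset and pageSize must be >= 0");
        }
        int keep = offset + pageSize;
        if (keep == 0) {
            return new ArrayList<>();
        }
        // Reversed order puts the worst retained hit at the head of the
        // heap, so it is the one dropped when a better hit arrives.
        PriorityQueue<T> heap = new PriorityQueue<>(keep, order.reversed());
        for (T hit : hits) {
            if (heap.size() < keep) {
                heap.add(hit);
            } else if (order.compare(hit, heap.peek()) < 0) {
                heap.poll();
                heap.add(hit);
            }
        }
        // Sort only the retained hits, then cut out the requested page.
        List<T> best = new ArrayList<>(heap);
        best.sort(order);
        int from = Math.min(offset, best.size());
        return new ArrayList<>(best.subList(from, best.size()));
    }
}
```

With offset 0 and pageSize 20 this would keep 20 hits no matter how many
records match; the cost grows with how deep into the results the user
pages, not with the size of the result set.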
Hopefully, someone can provide some initial advice. Many thanks.
Regards,
David