Hello Hoss,

Thanks for your reply :-)

I believe I'm in the first case: "to be able to search for 'foo' and get back a list of all sessions where the word 'foo' was used". However, I want to be able to separate free text search from field-based search.

I have put both the sessions and the messages in as documents: the session documents for free text search and the message documents for field-based search.

The algorithm that I've ended up using since I posted the initial message is:

 o execute the search on messages and session documents
 o then, from all hits, construct a list of the 'filename's that match and show the 10 newest results

This works, but I'm afraid it is not going to perform well once I end up indexing all sessions. There must be a way to get the right hit-set directly from a search.

In any case, I'm looking at Solr for potential answers, thanks for mentioning it :-)

Ta.
Jo

On Thu, Aug 7, 2008 at 12:59 AM, Chris Hostetter <[EMAIL PROTECTED]> wrote:

>
> : In addition to the full text search, I'd like to be able to perform searches
> : such as:
> : - list sessions from:xxx timestamp:200808*
> : - list sessions (from:xxx OR from:yyy)
> : - etc
> :
> : Would it be better to store each message as a separate document with its
> : fields, adding the 'filename' (session identifier) as an extra field? or
> : maybe is there a better way of doing it making the session file a document?
>
> As a general rule of thumb, you make 1 document for each result you want
> to get back when you execute a search ... if you want to be able to search
> for "foo" and get back a list of all sessions where the word "foo" was
> used, then each session should be a document. If you also want to be able
> to search for "foo" and get back a list of each message that contained the
> word "foo", then each message can also be a document -- either in another
> index, or even in the same index (there's no rule that says all documents
> must have the same fields)
>
> BTW: If you are planning on experimenting with the Java API, I would
> suggest sending any specific followup questions to the [EMAIL PROTECTED]
> list. But you may also want to consider checking out Solr, and the
> solr-user list. Depends on what level of abstraction you want to deal
> with (Solr provides a config based web service type front end for dealing
> with Lucene indexes, but also has a Java API both for indexing and for
> hooking in custom functionality when executing searches)
>
>
> -Hoss
>
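
For reference, the post-processing step described in the reply (collect the matching 'filename's from all hits and show the 10 newest) could be sketched roughly as below. This is only an illustration in plain Python, not Lucene or Solr code: the function name `newest_sessions`, the dict-shaped hits, and the sortable `timestamp` strings are all assumptions made for the example.

```python
def newest_sessions(message_hits, session_hits, limit=10):
    """Merge hits from both searches into session filenames, newest first.

    Each hit is assumed (hypothetically) to be a dict with a 'filename'
    (the session identifier) and a sortable 'timestamp' such as '20080807'.
    """
    newest = {}  # filename -> most recent timestamp seen for that session
    for hit in list(message_hits) + list(session_hits):
        name, ts = hit["filename"], hit["timestamp"]
        if name not in newest or ts > newest[name]:
            newest[name] = ts
    # Sort session filenames by their most recent timestamp, newest first.
    ranked = sorted(newest, key=newest.get, reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    messages = [
        {"filename": "a.log", "timestamp": "20080801"},
        {"filename": "b.log", "timestamp": "20080805"},
        {"filename": "a.log", "timestamp": "20080806"},
    ]
    sessions = [{"filename": "c.log", "timestamp": "20080803"}]
    print(newest_sessions(messages, sessions))
```

Because every hit has to be visited before the list can be sorted, the cost grows with the total number of hits, which is the performance worry raised in the reply.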
