What you lose by aggregating all real fields into one field is the ability to give the fields different scoring weights. Is a match in the post title just as important as a match in the body or in one of the comments? If so, aggregate. If not, keep the fields separate so each can be weighted on its own.
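To make the trade-off concrete, here is a toy sketch in Python. It is not Lucene's scoring formula and uses no Lucene API: the weights, the helper names (`score_per_field`, `score_aggregated`), and the sample posts are all made up for illustration, and comments are assumed to be joined into one string. It only shows that per-field weights can rank a title match above a comment match, while a single aggregated field cannot tell the two apart.

```python
from collections import Counter

# Hypothetical per-field weights; the values are illustrative only.
FIELD_WEIGHTS = {"BlogPostTitle": 3.0, "BlogPostContent": 1.0, "BlogComment": 0.5}


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real analyzer."""
    return text.lower().split()


def score_per_field(doc, term, weights=FIELD_WEIGHTS):
    """Weighted sum of term counts, with one weight per field."""
    term = term.lower()
    return sum(
        weights.get(field, 1.0) * Counter(tokenize(text))[term]
        for field, text in doc.items()
    )


def score_aggregated(doc, term):
    """Term count over one synthetic field that holds all the text."""
    contents = " ".join(doc.values())
    return Counter(tokenize(contents))[term.lower()]


# Two made-up posts: one mentions "picnic" in the title, one in a comment.
title_hit = {
    "BlogPostTitle": "Picnic plans",
    "BlogPostContent": "We are meeting on Saturday",
    "BlogComment": "Sounds great",
}
comment_hit = {
    "BlogPostTitle": "Weekend plans",
    "BlogPostContent": "We are meeting on Saturday",
    "BlogComment": "A picnic sounds great",
}

if __name__ == "__main__":
    for name, doc in (("title_hit", title_hit), ("comment_hit", comment_hit)):
        print(name, score_per_field(doc, "picnic"), score_aggregated(doc, "picnic"))
```

With separate fields the title match scores higher than the comment match. With the aggregated field both posts get the same score, which is fine only if you really do consider those matches equally important.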
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Bob Eastbrook <baconeater...@gmail.com>
> To: general@lucene.apache.org
> Sent: Mon, May 17, 2010 12:49:32 AM
> Subject: Should I avoid MultiFieldQueryParser?
>
> Imagine a blog that needs to be searched. I first thought I'd
> index posts and comments using these fields:
>
> BlogPostTitle
> BlogPostContent
> BlogComment
>
> There could be any number of BlogComments. I have this working fine
> and use MultiFieldQueryParser to generate a query. It seems to work.
> A search for "picnic" matches that term in post titles, post contents,
> and comments.
>
> However, "Lucene in Action" (2nd edition MEAP proof, chapter 5
> section 4) seems to advocate against using MultiFieldQueryParser and
> instead suggests using a single synthetic field to hold all searchable
> text. Perhaps this field would be called "contents" or "keywords".
>
> Is this accepted to be a best practice? Should I dump a BlogPostTitle,
> BlogPostContent, and its BlogComments into a single field?
>
> Bob