Unless you provide details on how you are indexing these documents, it's pretty hard to help.
It's also hard to reconcile your statement that OR is the default operator with the parsed query you posted: the '+' in front of every clause marks it as required, which really points to AND as the default.

There's no magic in Lucene that will automatically put the "content" of an (X)HTM document in the content field of your document. How are you ensuring that the doc is indexed as you expect?

Luke is a very valuable tool for inspecting your index to see if it is what you think it is...

Best
Erick

On Sat, Dec 11, 2010 at 8:34 PM, Celso Fontes <cels...@gmail.com> wrote:

> Hi, i have the same text in two files:
>
> TXT file: http://pastebin.com/u9Rd9VVA
> (X)HTM file: http://pastebin.com/ydHmTQZ8
>
> And i running this Question:
>
> APC (adenomatous polyposis coli) actin assembly
>
> with OR operator and SNOWBALL Analyser results in:
>
> +content:apc +(+content:adenomat +content:polyposi +content:coli)
> +content:actin +content:assembl
>
> But... only txt returns ok, why?
>
> ps: if i try without "()" i got the same result....
> Thanks,
> Celso
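
To illustrate the point about the content field: the text of an (X)HTM file has to be extracted by your own code before it is indexed. Below is a minimal sketch in plain Java of one way that extraction step might look. The class and method names (HtmlTextSketch, extractText) are hypothetical and not part of Lucene, and the regex-based tag stripping is deliberately crude (it does not decode entities such as &amp;); a real HTML parser would be a safer choice.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: turns (X)HTML markup into plain
// text so that the same words reach the analyzer as for the TXT file.
class HtmlTextSketch {
    // Drop <script> and <style> elements together with their bodies.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Drop any remaining tag, e.g. <p>, </b>, <br/>.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");
    // Collapse runs of whitespace left behind by the removals.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String extractText(String html) {
        String noScripts = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        String noTags = TAG.matcher(noScripts).replaceAll(" ");
        // Note: character entities (&amp;, &nbsp;, ...) are NOT decoded here.
        return WHITESPACE.matcher(noTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><p>APC (adenomatous polyposis coli)</p>"
                + "<p>actin assembly</p></body></html>";
        // The returned string is what would be passed to whatever code
        // builds the "content" field of the document.
        System.out.println(extractText(html));
    }
}
```

If the string produced by a step like this differs from the text of the TXT file, the two documents will end up with different terms in the index, which Luke would make visible.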