Jack,

Thanks. Yeah, I don't know what you mean by term analysis. I googled it but didn't come up with much. So if that is the preferred way of doing this, a wiki document would be greatly appreciated.

I notice you did say I should be doing the term analysis first. But is it wrong to do it the way I described in my original email? Will it give me incorrect results?

Bill

-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Friday, August 03, 2012 9:33 AM
To: java-user@lucene.apache.org
Subject: Re: Analyzer on query question

Bill, the simple answer to your original question is that in general you should apply the same or similar analysis for your query terms as you do with your indexed data. In your specific case the Query.toString is generating your unanalyzed terms and then the query parser is performing the needed analysis. The real point is that you should be doing the term analysis before invoking "new Term".

Alas, term analysis has changed dramatically over the past couple of years, so the solution to doing analysis before generating a Term/TermQuery will vary from Lucene release to release.

We really do need a wiki page for Lucene term analysis.

-- Jack Krupansky

-----Original Message-----
From: Bill Chesky
Sent: Friday, August 03, 2012 9:19 AM
To: simon.willna...@gmail.com ; java-user@lucene.apache.org
Subject: RE: Analyzer on query question

Thanks Simon,

Unfortunately, I'm using Lucene 3.0.1 and CharTermAttribute doesn't seem to have been introduced until 3.1.0. Similarly my version of Lucene does not have a BooleanQuery.addClause(BooleanClause) method. Maybe you meant BooleanQuery.add(BooleanClause).

In any case, most of what you're doing there, I'm just not familiar with. Seems very low level. I've never had to use TokenStreams to build a query before and I'm not really sure what is going on there. Also, I don't know what PositionIncrementAttribute is or how it would be used to create a PhraseQuery.

The way I'm currently creating PhraseQuerys is very straightforward and intuitive. E.g.
to search for the term "foo bar" I'd build the query like this:

PhraseQuery phraseQuery = new PhraseQuery();
phraseQuery.add(new Term("title", "foo"));
phraseQuery.add(new Term("title", "bar"));

Is there really no easier way to associate the correct analyzer with these types of queries?

Bill

-----Original Message-----
From: Simon Willnauer [mailto:simon.willna...@gmail.com]
Sent: Friday, August 03, 2012 3:43 AM
To: java-user@lucene.apache.org; Bill Chesky
Subject: Re: Analyzer on query question

On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky <bill.che...@learninga-z.com> wrote:
> Hi,
>
> I understand that generally speaking you should use the same analyzer on
> querying as was used on indexing. In my code I am using the
> SnowballAnalyzer on index creation. However, on the query side I am
> building up a complex BooleanQuery from other BooleanQuerys and/or
> PhraseQuerys on several fields. None of these require specifying an
> analyzer anywhere. This is causing some odd results, I think, because a
> different analyzer (or no analyzer?) is being used for the query.
>
> Question: how do I build my boolean and phrase queries using the
> SnowballAnalyzer?
>
> One thing I did that seemed to kind of work was to build my complex query
> normally then build a snowball-analyzed query using a QueryParser
> instantiated with a SnowballAnalyzer. To do this, I simply pass the
> string value of the complex query to the QueryParser.parse() method to get
> the new query.
> Something like this:
>
> // build a complex query from other BooleanQuerys and PhraseQuerys
> BooleanQuery fullQuery = buildComplexQuery();
> QueryParser parser = new QueryParser(Version.LUCENE_30, "title", new
> SnowballAnalyzer(Version.LUCENE_30, "English"));
> Query snowballAnalyzedQuery = parser.parse(fullQuery.toString());
>
> TopScoreDocCollector collector = TopScoreDocCollector.create(10000,
> true);
> indexSearcher.search(snowballAnalyzedQuery, collector);

you can just use the analyzer directly like this:

Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
TokenStream stream = analyzer.tokenStream("title", new StringReader(fullQuery.toString()));
CharTermAttribute termAttr = stream.addAttribute(CharTermAttribute.class);
stream.reset();
BooleanQuery q = new BooleanQuery();
while (stream.incrementToken()) {
  // BooleanQuery.add(BooleanClause): a clause wraps a Query (here a TermQuery) plus its Occur
  q.add(new BooleanClause(new TermQuery(new Term("title", termAttr.toString())), Occur.MUST));
}

you also have access to the token positions if you want to create phrase queries etc. just add a PositionIncrementAttribute like this:

PositionIncrementAttribute posAttr = stream.addAttribute(PositionIncrementAttribute.class);

pls. doublecheck the code it's straight from the top of my head.

simon

>
> Like I said, this seems to kind of work but it doesn't feel right. Does
> this make sense? Is there a better way?
>
> thanks in advance,
>
> Bill

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
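
Jack's point in this thread, that query terms need the same analysis as the indexed data before "new Term" is invoked, can be illustrated without any Lucene API. The sketch below is a stdlib-only toy, not Lucene code: the class AnalysisSketch and the method analyze are hypothetical names, and stripping a trailing "s" only stands in for real stemming (it is not the Snowball algorithm). It shows why a raw query term can miss an index built from analyzed terms, while the analyzed form of the same term matches.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalysisSketch {

    // Hypothetical toy analyzer: lowercases, splits on non-alphanumerics,
    // and strips one trailing "s" from tokens longer than three characters.
    // This only imitates the idea of stemming; it is NOT Snowball.
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first element when text starts with a delimiter
            }
            if (token.length() > 3 && token.endsWith("s")) {
                token = token.substring(0, token.length() - 1);
            }
            terms.add(token);
        }
        return terms;
    }

    public static void main(String[] args) {
        // "Index" the analyzed terms of a title.
        Set<String> indexedTerms = new HashSet<>(analyze("Foo Bars"));

        // A raw, unanalyzed query term does not match what was indexed.
        System.out.println(indexedTerms.contains("Bars")); // false

        // Running the query text through the same analysis first does match.
        System.out.println(indexedTerms.containsAll(analyze("Bars"))); // true
    }
}
```

In Lucene itself the equivalent step would be to run the query text through the index-time analyzer and build each Term from the resulting tokens; as noted above, the attribute classes used to read those tokens differ between releases.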