On Sun, Apr 24, 2011 at 8:30 AM, Grant Ingersoll <gsing...@apache.org> wrote: > > On Apr 21, 2011, at 5:02 PM, Clemens Wyss wrote: > >> I keep my search terms in a dedicated RAMDirectory (the termIndex). >> In there I palce all the term of my real index. When putting the terms into >> the >> termIndex I can still see [using the debugger] the Umlaute (äöü). >> Unfortunately when searching the >> termIndex the documents no more contain these Umlaute. >> >> Populating the termIndex: >> termIndex = new RAMDirectory(); >> IndexWriterConfig config = new IndexWriterConfig( Version.LUCENE_31, new >> TermAnalyzer( locale ) ); >> termIndexWriter = new IndexWriter( termIndex, config ); >> TermEnum tEnum = realIndexReader.terms(); >> while ( tEnum.next() ) >> { >> Term t = tEnum.term(); >> String termText = t.text(); >> Document termDocument = new Document(); >> Field field = new Field( FIELDNAME_TERM, termText, Field.Store.YES, >> Field.Index.ANALYZED ); >> termDocument.add( field ); >> // and add term into the index >> termIndexWriter.addDocument( termDocument ); >> } >> termIndexWriter.commit(); >> termIndexWriter.optimize(); >> termIndexWriter.close(); >> >> termIndexReader = IndexReader.open( termIndex, true ); >> ---------- searching terms >> Query q = fuzzy ? new FuzzyQuery( new Term( FIELDNAME_TERM, >> termFilter.toLowerCase() ) ) : >> new WildcardQuery( new Term( >> FIELDNAME_TERM, "*" + termFilter.toLowerCase() + "*" ) ); >> TopDocs topDocs = new IndexSearcher( getTermIndexReader() ).search( q, 100 ); >> for ( ScoreDoc hit : topDocs.scoreDocs ) >> { >> Document doc = getTermIndexReader().document( hit.doc ); >> String indexTerm = doc.get( FIELDNAME_TERM ); >> if ( !returnValue.contains( indexTerm ) ) >> { >> returnValue.add( indexTerm ); >> } >> } >> ---------- >> The TermAbnalyzer is the same analyzer as the main index analyzer with the >> exception that a LowerCaseFilter is applied. > > What is the Analyzer for the Main Index? What is the tokenizer and token > filters used?
in other words, can you provide what TermAnalyzer is composed of? simon > > Out of curiosity, what is the problem you are trying to solve? > > -------------------------- > Grant Ingersoll > http://www.lucidimagination.com/ > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org