Hi, I'am facing some problems in using Lucene. The index I am using is constructed like this:
try { Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English"); Directory dir = MMapDirectory.open(index); IndexWriter writer = new IndexWriter(dir, analyzer, MaxFieldLength.LIMITED); searcher = new IndexSearcher(dir); Document luceneDocument; int numClusters = clustering.getClusterCount(); String[] clusterLabels = clustering.getClusterLabels(); for (int cluId = 0; cluId != numClusters; ++cluId) { int[] docIds = clustering.getItemsOfCluster(cluId); for (int docId : docIds) { luceneDocument = new Document(); luceneDocument.add(new NumericField("id", Field.Store.YES, true).setIntValue(docId)); luceneDocument.add(new NumericField("cluster_id", Field.Store.YES, true).setIntValue(cluId)); luceneDocument.add(new Field( "plaintext", texts.get(docId), Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.YES)); luceneDocument.add(new Field( "label", clusterLabels[cluId], Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.YES)); writer.addDocument(luceneDocument); } } writer.optimize(); writer.close(); } catch (IOException e) { e.printStackTrace(); } Then, while the Java application is running, the speed of Lucene is good. I can sift through about 11,000 categories in a few minutes. However, if I restart the application and read in the previous created Lucene index instead of generating a new one via: try { Directory dir = MMapDirectory.open(index); searcher = new IndexSearcher(dir); } catch (CorruptIndexException e) { e.printStackTrace(); } catch (IOException e) { e.printStackTrace(); } Now, only about 10 categories are examined within a few minutes instead of 11,000 categories like before. Subsequently, my question is why the access to Lucene is very slow in the second case. A usually query looks like this: BooleanQuery booleanQuery = new BooleanQuery(); Term luceneTerm = new Term(PLAINTEXT, stemmer.process(candidate)); TermQuery termQuery = new TermQuery(luceneTerm); booleanQuery.add(termQuery, BooleanClause.Occur.MUST); NumericRangeQuery<Integer> lTerm = NumericRangeQuery.newIntRange(CLUSTER_ID, clusterId, clusterId, true, true); booleanQuery.add(lTerm, BooleanClause.Occur.MUST); TopDocs resultSet = queryIndex(searcher, booleanQuery); Thank you! -- View this message in context: http://lucene.472066.n3.nabble.com/IndexSearch-very-slow-after-reopening-the-index-tp1699711p1699711.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org