Hi,

You describe two separate problems: indexing speed and search issues.
Have you done any CPU profiling to determine where to begin looking for the slow indexing speed? It sounds like you've ruled out an I/O bottleneck, but it could still be a slow database you're reading from. Try simplifying your code by removing the merge policy tweaks (the default policies should be enough) and creating new Document/Field instances instead of reusing them. Also, move the Optimize and Commit calls outside your While loop.

Your search issue is probably due to your use of StandardAnalyzer. It does not know the secret meaning of "I-A-05.50" (product number? secret identifier?) and will tokenize it into "I" and "05.50"; the "A" is skipped because it is a default stopword. I have to admit a lack of knowledge regarding StandardAnalyzer's use of positional information, but you're currently searching for the phrase "I 05.50", or "I [anything] 05.50". Could you provide some example data that you expect to match but isn't returned by your IndexSearcher?
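A rough sketch of the simplified indexing loop (untested, and assuming the same dbrs recordset, getNextFile routine, and strFile/strBody variables from your code):

Dim dir As New Store.SimpleFSDirectory(New DirectoryInfo("c:\test"))
Dim anlz As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
Dim indx As New Index.IndexWriter(dir, anlz, True, IndexWriter.MaxFieldLength.UNLIMITED)
Dim intC As Integer = 0

While dbrs.EOF = False
    intC = intC + 1
    getNextFile

    ' New Document and Field instances for every record.
    Dim doc As New Documents.Document
    doc.Add(New Documents.Field("id", intC.ToString(), Documents.Field.Store.YES, Documents.Field.Index.NO))
    doc.Add(New Documents.Field("path", strFile, Documents.Field.Store.YES, Documents.Field.Index.NO))
    doc.Add(New Documents.Field("body", strBody, Documents.Field.Store.YES, Documents.Field.Index.ANALYZED))

    indx.AddDocument(doc)
End While

' Optimize and commit once, after all documents have been added.
indx.Optimize()
indx.Commit()
indx.Close()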
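If it helps, you can also print what the parser actually built from your phrase, so you can see which terms the query will try to match. Just a sketch, reusing your field name and analyzer:

Dim anlz As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
Dim parser As New QueryParsers.QueryParser(Lucene.Net.Util.Version.LUCENE_29, "body", anlz)
Dim query As Search.Query = parser.Parse("""I-A-05.50""")

' Shows the terms the phrase query will actually search for after analysis.
Console.WriteLine(query.ToString("body"))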
// Simon

-----Original Message-----
From: shane.bump...@ineos.com [mailto:shane.bump...@ineos.com]
Sent: Monday, May 14, 2012 4:22 PM
To: lucene-net-user@lucene.apache.org
Subject: Question on basic functionality

I've been trying various things to make the indexing faster. I've not been able to get successful searches unless I do an optimize and commit after adding each document; without them a search does return a value, but not all of the values I'm expecting. I've tried moving the commit to the end, which makes it a ton faster, but where I expect to get 4 entries back I only get one. I suspect it has to do with the index being split into segments that aren't merged at the end. It's only indexing 2k records, but it's taking around an hour on my dual-core laptop. I've already tried using a RAM disk to rule out an I/O bottleneck, but it gave about the same result. Any help would be appreciated.

Here's the basic code I'm using to add records:

Dim dir As New Store.SimpleFSDirectory(New DirectoryInfo("c:\test"))
Dim anlz As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
Dim indx As New Index.IndexWriter(dir, anlz, True, IndexWriter.MaxFieldLength.UNLIMITED)
Dim mergePolicy As MergePolicy
Dim logPolicy As LogMergePolicy

indx.SetUseCompoundFile(True)
mergePolicy = indx.GetMergePolicy
logPolicy = mergePolicy
logPolicy.SetNoCFSRatio(1)
indx.SetRAMBufferSizeMB(256)

intC = 0
Dim doc As New Documents.Document
While dbrs.EOF = False
    intC = intC + 1
    getNextFile
    If intC = 1 Then
        doc.Add(New Documents.Field("id", intC, Documents.Field.Store.YES, Documents.Field.Index.NO))
        doc.Add(New Documents.Field("path", strFile, Documents.Field.Store.YES, Documents.Field.Index.NO))
        doc.Add(New Documents.Field("body", strBody, Documents.Field.Store.YES, Documents.Field.Index.ANALYZED))
    Else
        doc.GetField("id").SetValue(intC)
        doc.GetField("path").SetValue(strFile)
        doc.GetField("body").SetValue(strBody)
    End If
    indx.AddDocument(doc)
    indx.Optimize(1)
    indx.Commit()
End While

Here's the code I'm using to search:

Dim dir As New Store.SimpleFSDirectory(New DirectoryInfo("c:\test"))
Dim IR As IndexReader = IndexReader.Open(dir, True)
Dim anlz As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
Dim parser As New QueryParsers.QueryParser(Lucene.Net.Util.Version.LUCENE_29, "body", anlz)
Dim query As Search.Query
Dim searcher As IndexSearcher
Dim resultDocs As TopDocs
Dim hits() As ScoreDoc
Dim hit As ScoreDoc
Dim score As Double
Dim i As Integer

query = parser.Parse("""I-A-05.50""")
searcher = New IndexSearcher(IR)
resultDocs = searcher.Search(query, IR.MaxDoc())
Console.WriteLine("Found " & resultDocs.TotalHits & " results")

Dim doc As Documents.Document
hits = resultDocs.ScoreDocs
For Each hit In hits
    doc = searcher.Doc(hit.Doc)
    score = hit.Score
    Console.WriteLine("Results num: " & i + 1 & " score: " & score)
    Console.WriteLine("ID: " & doc.Get("id"))
    Console.WriteLine("Path: " & doc.Get("path"))
Next

searcher.Close()
dir.Close()