I've been trying various things to make indexing faster. So far, searches
only return everything I expect when I call Optimize and Commit after
adding each document. Without that, a search still returns results, but
not all of the ones I'm expecting. Moving the commit to the end of the
loop makes indexing a ton faster, but a query that should return 4 entries
returns only one. I suspect it has to do with the index being split into
segments that aren't merged at the end.

It's only indexing 2k records, but it takes around an hour on my dual-core
laptop. I have already tried using a RAM disk to rule out an I/O
bottleneck, but it gave about the same result. Any help would be
appreciated.

Here's the basic code I'm using to add records:

    ' Open the index in c:\test, creating it from scratch.
    Dim dir As New Store.SimpleFSDirectory(New DirectoryInfo("c:\test"))
    Dim anlz As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
    Dim indx As New Index.IndexWriter(dir, anlz, True, _
        IndexWriter.MaxFieldLength.UNLIMITED)

    ' Compound-file and RAM-buffer settings.
    indx.SetUseCompoundFile(True)
    Dim logPolicy As LogMergePolicy = _
        DirectCast(indx.GetMergePolicy(), LogMergePolicy)
    logPolicy.SetNoCFSRatio(1)
    indx.SetRAMBufferSizeMB(256)

    intC = 0
    ' One Document instance is reused for every record; only the field
    ' values change between iterations.
    Dim doc As New Documents.Document
    While dbrs.EOF = False
        intC = intC + 1
        getNextFile()
        If intC = 1 Then
            ' First record: create the fields.
            doc.Add(New Documents.Field("id", intC.ToString(), _
                Documents.Field.Store.YES, Documents.Field.Index.NO))
            doc.Add(New Documents.Field("path", strFile, _
                Documents.Field.Store.YES, Documents.Field.Index.NO))
            doc.Add(New Documents.Field("body", strBody, _
                Documents.Field.Store.YES, Documents.Field.Index.ANALYZED))
        Else
            ' Later records: overwrite the values on the existing fields.
            doc.GetField("id").SetValue(intC.ToString())
            doc.GetField("path").SetValue(strFile)
            doc.GetField("body").SetValue(strBody)
        End If
        indx.AddDocument(doc)
        ' Optimizing and committing after every document is the slow part,
        ' but without it my searches miss results.
        indx.Optimize(1)
        indx.Commit()
    End While

Here's the code I'm using to search:

    ' Open the same index read-only.
    Dim dir As New Store.SimpleFSDirectory(New DirectoryInfo("c:\test"))
    Dim IR As IndexReader = IndexReader.Open(dir, True)
    Dim anlz As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
    Dim parser As New QueryParsers.QueryParser( _
        Lucene.Net.Util.Version.LUCENE_29, "body", anlz)
    Dim query As Search.Query
    Dim searcher As IndexSearcher
    Dim resultDocs As TopDocs
    Dim hits() As ScoreDoc
    Dim hit As ScoreDoc
    Dim score As Double
    Dim i As Integer

    ' Phrase query against the "body" field.
    query = parser.Parse("""I-A-05.50""")
    searcher = New IndexSearcher(IR)
    ' Ask for up to MaxDoc hits so that no match is cut off.
    resultDocs = searcher.Search(query, IR.MaxDoc())
    Console.WriteLine("Found " & resultDocs.TotalHits & " results")

    Dim doc As Documents.Document
    hits = resultDocs.ScoreDocs
    For Each hit In hits
        doc = searcher.Doc(hit.Doc)
        score = hit.Score
        i = i + 1   ' running result number
        Console.WriteLine("Results num: " & i & " score: " & score)
        Console.WriteLine("ID: " & doc.Get("id"))
        Console.WriteLine("Path: " & doc.Get("path"))
    Next

    searcher.Close()
    IR.Close()   ' the searcher does not close a reader that was passed in
    dir.Close()