Yup, between the Java Lucene community/docs and the Solr source code I've been able to find out quite a bit.
At times I'm annoyed that Lucene.NET isn't more ".NET like", but it's generally useful that it matches up so well to Lucene.

-Kurt

From: Simone Busoli [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 26, 2007 9:19 PM
To: [email protected]
Subject: Re: Back on this single write thing

As far as I can see there's no .NET documentation, but since Lucene.Net looks like a class-per-class port of Lucene, and the indexes are in the same format and compatible, too, the Java docs are very good for Lucene.Net as well.

Simone

Patrick Burrows wrote:

ooh, good resource, hadn't seen the wiki.

...of course, I've been staying away from all the java versions of lucene anyway for the express purpose of growing the .net specific documentation... but I'll look there.

On 6/26/07, Simone Busoli <[EMAIL PROTECTED]> wrote:

These topics are discussed in the Lucene (Java) documentation (http://wiki.apache.org/lucene-java/UpdatingAnIndex); you should take a look there, and also skim the Lucene in Action book. You may take a look at the IndexModifier class, too.

Simone

Kurt Mackey wrote:

The answer to A is: basically because a delete is more like a Read than a Write, in the context of Lucene. :) Also, it doesn't really make much sense.

For B: The "recommended" way to handle this is generally like this:

1. Get the batch of documents to update
2. Open a new IndexReader
3. Loop through the documents to delete them
4. Close the IndexReader you were using for deletes
5. Open your IndexWriter
6. Write your documents out
7. Close the IndexWriter
8. Reopen the "main" IndexSearcher so it sees the updated index

I ended up implementing sort of a work queue for Index operations. My basic work units are:

* Index - queues a document to be added/updated
* Commit - commits anything to be written, using the steps above
* Optimize - optimizes the index

One issue I had with the above, though, was that I might end up adding new versions of the same document a few times before the commit. This resulted in duplicate entries, since the delete and write were disparate operations. Cramming everything into a hashtable made for an easy enough fix, but I hadn't actually thought about it before seeing duplicates. :)

-Kurt

-----Original Message-----
From: Patrick Burrows [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 26, 2007 7:48 PM
To: [email protected]
Subject: Back on this single write thing

I created a singleton IndexWriter, pasted below in case anyone else wants it [1].

But now I have a bit of a problem. Someone mentioned that I can't have my index readers delete either. Makes sense, since that is a write operation.

I just realized that one of the processes I am moving to use the new singleton stuff is a "Refresh()" method. It loops through each document, deletes it (using an IndexReader) and then immediately recreates it (using an IndexWriter).

A -- why aren't these methods (delete and add) part of the same class?

B -- but, more importantly (and less whining)... how do you handle this?

From my understanding you can't just update fields in an already indexed document. You have to delete it and then re-add it. This operation necessarily involves a Delete and an Add.

Any thoughts would be helpful.

[1]

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using FullTextSearch.Tasks.Properties;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Directory=Lucene.Net.Store.Directory;

namespace FullTextSearch.Tasks
{
    // An IndexWriter shared by the whole process; writes are serialized with a lock.
    public sealed class IndexWriterSingleton : IndexWriter
    {
        // Opens the existing index at Settings.Default.IndexPath (create == false).
        private static readonly IndexWriterSingleton instance =
            new IndexWriterSingleton(Settings.Default.IndexPath, new StandardAnalyzer(), false);

        static readonly object lockhandle = new object();

        // Explicit static constructor so the compiler does not mark the type beforefieldinit.
        static IndexWriterSingleton(){}

        public static IndexWriterSingleton Instance
        {
            get { return instance; }
        }

        public IndexWriterSingleton(FileInfo path, Analyzer a, bool create) : base(path, a, create){}

        public IndexWriterSingleton(string path, Analyzer a, bool create) : base(path, a, create){}

        public IndexWriterSingleton(Directory d, Analyzer a, bool create) : base(d, a, create){}

        public override void AddDocument(Lucene.Net.Documents.Document doc)
        {
            lock (lockhandle)
            {
                base.AddDocument(doc);
            }
        }

        public override void AddDocument(Lucene.Net.Documents.Document doc, Analyzer analyzer)
        {
            lock (lockhandle)
            {
                base.AddDocument(doc, analyzer);
            }
        }

        public override void Optimize()
        {
            lock (lockhandle)
            {
                base.Optimize();
            }
        }
    }
}

--
- P
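
Kurt's work queue and his hashtable fix for duplicates are only described in prose in the thread, so here is a minimal sketch of how they might look. It is not his actual code. The names `IIndexBackend`, `PendingIndexQueue` and `ListBackend` are hypothetical, and the backend is an in-memory stand-in rather than Lucene.Net: in a real setup `DeleteById` would presumably go through an IndexReader and `Add` through an IndexWriter, following the eight steps above. Reopening the searcher and the Optimize work unit are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the index. With Lucene.Net the delete phase would
// use an IndexReader and the add phase an IndexWriter (assumption, not shown).
public interface IIndexBackend
{
    void DeleteById(string id);        // delete phase
    void Add(string id, string body);  // add phase
}

// Collects pending updates keyed by document id, so queuing the same document
// several times before a commit keeps only its latest version.
public sealed class PendingIndexQueue
{
    private readonly object gate = new object();
    private Dictionary<string, string> pending = new Dictionary<string, string>();

    // "Index" work unit: remember the document, overwriting any earlier version.
    public void Index(string id, string body)
    {
        lock (gate) { pending[id] = body; }
    }

    // "Commit" work unit: take the batch, run all deletes, then all adds.
    // Returns the number of documents written.
    public int Commit(IIndexBackend backend)
    {
        Dictionary<string, string> batch;
        lock (gate)
        {
            batch = pending;
            pending = new Dictionary<string, string>();
        }
        foreach (string id in batch.Keys)
        {
            backend.DeleteById(id);
        }
        foreach (KeyValuePair<string, string> doc in batch)
        {
            backend.Add(doc.Key, doc.Value);
        }
        return batch.Count;
    }
}

// In-memory backend used only to demonstrate the behaviour.
public sealed class ListBackend : IIndexBackend
{
    public readonly List<KeyValuePair<string, string>> Docs =
        new List<KeyValuePair<string, string>>();

    public void DeleteById(string id) { Docs.RemoveAll(d => d.Key == id); }

    public void Add(string id, string body)
    {
        Docs.Add(new KeyValuePair<string, string>(id, body));
    }
}

public static class Demo
{
    public static void Main()
    {
        PendingIndexQueue queue = new PendingIndexQueue();
        ListBackend backend = new ListBackend();
        queue.Index("42", "first draft");
        queue.Index("42", "second draft");  // replaces the first draft
        queue.Index("7", "other doc");
        int written = queue.Commit(backend);
        Console.WriteLine(written + " written, " + backend.Docs.Count + " in index");
        // prints "2 written, 2 in index"
    }
}
```

Keying the pending batch by document id is what prevents the duplicates: without it, two queued versions of document 42 would both be added after a single delete.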
