You're welcome. I'm a newbie too, just trying to get some things done
with Lucene.Net. I must say I don't like the API very much: it's too
Java-ish and not very user friendly, and indexing operations should be
more intuitive, IMHO. Unfortunately, Java Lucene is much more actively
developed and supported.
Simone
Patrick Burrows wrote:
This really is a great resource. The FAQ is answering a bunch of
questions I had, but didn't want to inundate the list with. Thanks
Simone! :-)
On 6/26/07, Simone Busoli <[EMAIL PROTECTED]> wrote:
As far as I can see there's no .NET documentation, but since Lucene.Net
looks like a class-per-class port of Lucene, and the indexes are in the
same format and compatible, too, the Java docs are very good for
Lucene.Net as well.
Simone
Patrick Burrows wrote:
ooh, good resource, hadn't seen the wiki. ...of course, I've been
staying away from all the java versions of lucene anyway for the express
purpose of growing the .net specific documentation... but I'll look
there.
On 6/26/07, Simone Busoli <[EMAIL PROTECTED]> wrote:
These topics are discussed in the Lucene (Java) documentation
(http://wiki.apache.org/lucene-java/UpdatingAnIndex). You should take a
look there, and also skim the Lucene in Action book. You may want to
look at the IndexModifier class, too.
Simone
Kurt Mackey wrote:
The answer to A is: basically because a delete is more like a Read than
a Write, in the context of Lucene. :) Also, it doesn't really make much
sense.
For B: The "recommended" way to handle this is generally like this:
1. Get the batch of documents to update
2. Open a new IndexReader
3. Loop through the documents to delete them
4. Close the IndexReader you were using for deletes
5. Open your IndexWriter
6. Write your documents out
7. Close the IndexWriter
8. Reopen the "main" IndexSearcher so it sees the updated index
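The steps above can be sketched roughly as follows. This is a minimal sketch, not Kurt's actual code: it assumes the Lucene.Net 2.x API (IndexReader.DeleteDocuments, the IndexWriter constructor taking a path), and the BatchUpdater class and the "id" field are hypothetical names. The "id" field is assumed to be stored and indexed untokenized so that a Term lookup matches it exactly.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class BatchUpdater
{
    // Hypothetical unique-key field; not part of the original post.
    private const string IdField = "id";

    // Step 1 is the caller's job: 'batch' holds the documents to update.
    // Returns a fresh searcher; the caller should close the old one.
    public static IndexSearcher Update(string indexPath, ICollection<Document> batch)
    {
        // Steps 2-4: one IndexReader handles every delete, then is closed.
        IndexReader reader = IndexReader.Open(indexPath);
        try
        {
            foreach (Document doc in batch)
            {
                reader.DeleteDocuments(new Term(IdField, doc.Get(IdField)));
            }
        }
        finally
        {
            reader.Close();
        }

        // Steps 5-7: open the writer on the existing index (create = false),
        // add the new versions, then close it to flush and release the lock.
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), false);
        try
        {
            foreach (Document doc in batch)
            {
                writer.AddDocument(doc);
            }
        }
        finally
        {
            writer.Close();
        }

        // Step 8: a new searcher sees the updated index.
        return new IndexSearcher(indexPath);
    }
}
```

Note that deleting through an IndexReader needs the index write lock, so this sequence only works if no IndexWriter is still open on the same index while the deletes run.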
I ended up implementing sort of a work queue for Index operations. My
basic work units are:
* Index - queues a document to be added/updated
* Commit - writes out anything queued, using the steps above
* Optimize - optimizes the index
One issue I had with the above, though, was that I might end up adding
new versions of the same document a few times before the commit. This
resulted in duplicate entries, since the delete and write were disparate
operations. Cramming everything into a hashtable made for an easy enough
fix, but I hadn't actually thought about it before seeing duplicates. :)
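The hashtable fix might look something like the sketch below. It is an illustration only, not Kurt's code: PendingUpdates, Index and Drain are hypothetical names, and it assumes every document has a unique string key. Queuing the same key twice overwrites the earlier version, so a commit sees each document at most once.

```csharp
using System.Collections.Generic;

// Holds documents queued since the last commit, keyed by a unique id.
public sealed class PendingUpdates<TDoc>
{
    private readonly Dictionary<string, TDoc> pending = new Dictionary<string, TDoc>();
    private readonly object gate = new object();

    // The "Index" work unit: the latest version of a document replaces
    // any version queued earlier under the same id.
    public void Index(string id, TDoc doc)
    {
        lock (gate)
        {
            pending[id] = doc;
        }
    }

    // Called by the "Commit" work unit: hands back the batch and empties
    // the queue so the next commit starts clean.
    public List<KeyValuePair<string, TDoc>> Drain()
    {
        lock (gate)
        {
            List<KeyValuePair<string, TDoc>> batch =
                new List<KeyValuePair<string, TDoc>>(pending);
            pending.Clear();
            return batch;
        }
    }
}
```

A commit would then call Drain(), delete each key through the IndexReader, and add each value through the IndexWriter.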
-Kurt
-----Original Message-----
From: Patrick Burrows [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 26, 2007 7:48 PM
To: [email protected]
Subject: Back on this single write thing
I created a singleton IndexWriter, pasted below in case anyone else
wants it [1]. But now I have a bit of a problem. Someone mentioned that
I can't have my index readers delete either. Makes sense, since that is
a write operation.
I just realized that one of the processes I am moving to use the new
singleton stuff is a "Refresh()" method. It loops through each document,
deletes it (using an IndexReader) and then immediately recreates it
(using an IndexWriter).
A -- why aren't these methods (delete and add) part of the same class?
B -- but, more importantly (and less whining)... how do you handle this?
From my understanding you can't just update fields in an already indexed
document. You have to delete it and then re-add it. This operation
necessarily involves a Delete and an Add. Any thoughts would be helpful.
[1]
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using FullTextSearch.Tasks.Properties;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Directory = Lucene.Net.Store.Directory;

namespace FullTextSearch.Tasks
{
    // A single shared IndexWriter whose write operations are serialized
    // through one lock.
    public sealed class IndexWriterSingleton : IndexWriter
    {
        // Opens the existing index (create = false) at the configured path.
        private static readonly IndexWriterSingleton instance =
            new IndexWriterSingleton(Settings.Default.IndexPath,
                                     new StandardAnalyzer(), false);

        private static readonly object lockhandle = new object();

        // Explicit static constructor, so the instance is not created
        // before the type is first used.
        static IndexWriterSingleton() { }

        public static IndexWriterSingleton Instance
        {
            get { return instance; }
        }

        public IndexWriterSingleton(FileInfo path, Analyzer a, bool create)
            : base(path, a, create) { }

        public IndexWriterSingleton(string path, Analyzer a, bool create)
            : base(path, a, create) { }

        public IndexWriterSingleton(Directory d, Analyzer a, bool create)
            : base(d, a, create) { }

        public override void AddDocument(Lucene.Net.Documents.Document doc)
        {
            lock (lockhandle)
            {
                base.AddDocument(doc);
            }
        }

        public override void AddDocument(Lucene.Net.Documents.Document doc,
                                         Analyzer analyzer)
        {
            lock (lockhandle)
            {
                base.AddDocument(doc, analyzer);
            }
        }

        public override void Optimize()
        {
            lock (lockhandle)
            {
                base.Optimize();
            }
        }
    }
}
--
-
P