Hi Daniel,
How do you use a separate database to check for duplicate fields? That
sounds interesting!
Best Regards.
jacky
----- Original Message -----
From: "Daniel Noll" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Friday, September 08, 2006 3:08 PM
Subject: Re: duplicate fields
> jacky wrote:
> > Hi, 1. Is there an effective method to check whether a field holding
> > a unique ID already exists when a document is added to the Lucene
> > index? Should I run a search on this field?
>
> One way is to create an IndexReader and IndexSearcher on your index,
> which you reopen every now and then. But we do this task by using a
> separate database, for the sake of efficiency.
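
A minimal sketch of the separate-store idea, under loud assumptions: the
thread does not say how the separate database is set up, so an in-memory
HashSet stands in for it here, and the class and method names
(UniqueIdRegistry, register, contains) are hypothetical. The point is
only that the uniqueness check happens outside the index, before the
document is added.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an external ID store; a real setup would
// presumably back this with a persistent database rather than memory.
final class UniqueIdRegistry {
    private final Set<String> seenIds = new HashSet<>();

    // Records the ID and returns true if it was not seen before;
    // returns false (and changes nothing) if it is a duplicate.
    boolean register(String id) {
        return seenIds.add(id);
    }

    // Pure lookup, without recording anything.
    boolean contains(String id) {
        return seenIds.contains(id);
    }
}
```

A caller would add the document to the index only when register(id)
returns true, and skip or update it otherwise.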
>
> > 2. Is there an effective method to check whether duplicate fields
> > (holding a unique ID) exist in the Lucene index? I can think of two
> > methods: read all documents and compare the fields, or search for
> > each field value. Is there a better one?
>
> The simplest way without using an external database is to use the
> termDocs enumeration. For each term you can easily see which ones have
> multiple documents, so every document other than the first for each term
> is a duplicate (which you could then use to build a filter to remove
> duplicates.)
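
The "every document other than the first" rule can be sketched as
follows. This is an illustration, not Lucene code: a plain map from each
ID term to its list of document numbers stands in for what the term and
document enumerations would yield, and the names DuplicateFinder,
findDuplicates, postings and maxDoc are hypothetical. The resulting
BitSet plays the role of the filter mentioned above.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

final class DuplicateFinder {
    // postings: ID term -> document numbers containing that term, in order.
    // maxDoc: initial size hint for the bit set (assumed upper bound on
    // document numbers; the BitSet grows if it is exceeded).
    static BitSet findDuplicates(Map<String, List<Integer>> postings, int maxDoc) {
        BitSet duplicates = new BitSet(maxDoc);
        for (List<Integer> docs : postings.values()) {
            // Keep the first document for each term; flag all later ones.
            for (int i = 1; i < docs.size(); i++) {
                duplicates.set(docs.get(i));
            }
        }
        return duplicates;
    }
}
```

Terms with a single document contribute nothing, so the bit set stays
empty when every ID is unique.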
>
> Daniel
>
>
>
> --
> Daniel Noll
>
> Nuix Pty Ltd
> Suite 79, 89 Jones St, Ultimo NSW 2007, Australia Ph: +61 2 9280 0699
> Web: http://www.nuix.com.au/ Fax: +61 2 9212 6902