Thanks, this might do it. But do I need to know the terms beforehand? I just want to return any terms with a frequency greater than one.

Erick Erickson wrote:
Sure, you can use the TermDocs/TermEnum classes. Basically, for a term (probably a column value in your app), these let you quickly answer the question "which (and how many) documents does this term appear in?". What you get back is the Lucene doc id, which lets you fetch all the information about the documents you want.

Erick
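
A rough sketch of the idea, not code from this thread: the terms do not have to be known in advance, because the term dictionary can be walked from start to end and only the terms that occur in more than one document kept. In Lucene 2.x, IndexReader.terms() returns a TermEnum whose docFreq() gives that count, and IndexReader.termDocs(Term) then lists the doc ids. The plain-Java model below stands in for the index so that it runs without Lucene on the classpath. The class name DuplicateFinder, the method findDuplicates and the sample values are hypothetical, and it assumes each column value is indexed as a single untokenized term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Lucene index over one table column.
public class DuplicateFinder {

    // Returns every column value that occurs in more than one row, mapped to
    // the row indexes where it occurs (row indexes play the role of doc ids).
    public static Map<String, List<Integer>> findDuplicates(List<String> columnValues) {
        // Build "term -> doc ids" postings, as an inverted index would.
        Map<String, List<Integer>> postings = new LinkedHashMap<>();
        for (int row = 0; row < columnValues.size(); row++) {
            postings.computeIfAbsent(columnValues.get(row), k -> new ArrayList<>()).add(row);
        }
        // Enumerate all terms and keep those whose document frequency exceeds one.
        Map<String, List<Integer>> duplicates = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : postings.entrySet()) {
            if (entry.getValue().size() > 1) {
                duplicates.put(entry.getKey(), entry.getValue());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> artists = Arrays.asList("Abba", "Beck", "Abba", "Cher", "Beck");
        System.out.println(findDuplicates(artists)); // prints {Abba=[0, 2], Beck=[1, 4]}
    }
}
```

The returned row indexes are what the Swing table filter would need. With a real index, the doc ids from TermDocs would have to be mapped back to table rows, for example through a stored row-id field (again an assumption about how the index is built).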

On 2/23/07, Paul Taylor <[EMAIL PROTECTED]> wrote:

    Hi, I have a Java Swing application with a table, and I was considering
    using Lucene to index the data in the table. One task I'd like to support
    is for the user to select 'Find Duplicate records for Column X'; I would
    then filter the table to show only records where more than one record has
    the same value, i.e. is a duplicate for that column. Is there a way to
    return all the duplicates from a Lucene index?

    Thanks, Paul Taylor

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [EMAIL PROTECTED]
    For additional commands, e-mail: [EMAIL PROTECTED]


