Can someone please help with the logic for deciding whether the closeness
requirement given below (for example, a 50% match) is met? The matching is
pure text matching.

Since the current Lucene score does not translate into a percentage of
closeness, is there anything else that can provide this information?

Or is there any other route that can be taken instead of percentage
matching?
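
To make the question concrete, here is one route I can imagine, offered only
as a rough sketch and not as anything Lucene provides: compute the cosine
similarity between the raw term counts of the two documents. That value is
always between 0 and 1, so it can be read as a percentage. The class name
TextCloseness, its method names, and the simple tokenizer are all my own
assumptions for illustration; there is no stemming, stop-word removal, or
IDF weighting here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: cosine similarity over raw term counts.
final class TextCloseness {

    // Lowercase the text, split on runs of non-letter/non-digit characters,
    // and count how often each term occurs.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine of the angle between the two term-count vectors, in [0, 1].
    // Returns 0 when either text contains no terms.
    static double cosine(String a, String b) {
        Map<String, Integer> ca = termCounts(a);
        Map<String, Integer> cb = termCounts(b);

        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ca.entrySet()) {
            Integer other = cb.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * (double) other;
            }
        }

        double normA = 0.0;
        for (int v : ca.values()) {
            normA += (double) v * v;
        }
        double normB = 0.0;
        for (int v : cb.values()) {
            normB += (double) v * v;
        }

        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // True when the similarity reaches the threshold, e.g. 0.5 for "50%".
    static boolean isCloseMatch(String a, String b, double threshold) {
        return cosine(a, b) >= threshold;
    }
}
```

With something like this, MoreLikeThis could be used only to shortlist
candidate documents, and TextCloseness.isCloseMatch(newText, candidateText, 0.5)
would then decide whether the action is triggered. Whether cosine similarity
is the right definition of "50% matching" for the business is a separate
question.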

Thanks

Saurabh

On Tue, Aug 30, 2011 at 11:55 AM, Saurabh Gokhale
<saurabhgokh...@gmail.com> wrote:

> Hi All,
>
> I need your help to understand how I can have Lucene applied to the
> following business scenario. Question is in RED
>
> *Business Scenario:*
> Analyze a newly created document "A" against the existing documents in the
> system, and if document A is more than 50% similar to any of the
> existing documents, perform a specific action.
>
> *Possible Lucene Implementation:*
> Requirement: Analyze newly created document A
> Action: Read name and the contents of the document A
>
> Requirement: Analyze new document with existing documents in the system
> Action: 1. Pre-index all the existing documents and create a Lucene index. 2.
> Use a class like MoreLikeThis to find documents similar to the newly created
> document.
>
> Requirement: If match is above 50%, perform specific action
> Action: Since the resulting Lucene score for the match cannot be directly
> converted into a percentage match (as the score value changes based on many
> factors), how can this requirement be satisfied?
>
> Thanks
>
> Saurabh
>
