How do you want to decide whether something is a copy: strictly (exact 
match) or loosely (near-duplicate)?  Solr and Nutch have some deduplication 
capabilities, including fuzzy matching.  They could probably be brought into 
Mahout, too.
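
To make the strict/loose distinction concrete, here is a minimal sketch in 
Python. It is not how Solr or Nutch implement deduplication; it assumes 
word-level shingles compared with Jaccard similarity, and the names 
(shingles, jaccard, similarity, is_copy), the shingle size k=3, and the 0.5 
threshold are all hypothetical choices for illustration only.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(doc_a, doc_b, k=3):
    """Loose similarity in [0, 1] based on shared k-word shingles."""
    return jaccard(shingles(doc_a, k), shingles(doc_b, k))


def is_copy(doc_a, doc_b, strict=False, threshold=0.5):
    """Strict mode: exact string equality. Loose mode: similarity >= threshold.

    The threshold is an assumed value and would need tuning on real data.
    """
    if strict:
        return doc_a == doc_b
    return similarity(doc_a, doc_b) >= threshold
```

For example, is_copy(a, b, strict=True) only accepts byte-for-byte identical 
text, while is_copy(a, b) also flags documents that share most of their 
three-word sequences. For large collections, comparing every pair this way 
gets expensive, which is where signature- or hash-based approaches come in.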

-Grant

On Jul 7, 2010, at 10:23 AM, JAGANADH G wrote:

> Dear All
> 
> Is there any way or algorithm available to compare two documents?
> E.g., check whether doc "A" is a copy (plagiarised version) of document "B".
> 
> With regards
> 
> -- 
> **********************************
> JAGANADH G
> http://jaganadhg.freeflux.net/blog
