Hi, you may want to look into plagiarism detectors.
My approach would be:
1. Hash all the keywords in one file and keep a count for each.
2. For each keyword in the other file, check whether it exists in the hash table. If it does, decrement its count and increment a counter that represents the similarity between the two docs.

For a percentage, you can also count the total keywords in the second doc and compute "found keywords" / total keywords (times 100).

On Wed, Jul 6, 2011 at 11:41 AM, Navneet Gupta <navneetn...@gmail.com> wrote:
> See the diff documentation. It's an application of the Longest Common
> Subsequence problem.
> http://en.wikipedia.org/wiki/Diff
>
> On Wed, Jul 6, 2011 at 11:12 AM, priyanshu <priyanshuro...@gmail.com> wrote:
> > What is the most efficient way to compare two text documents? Also we
> > need to find the percentage by which they match.
> >
> > Thanks,
> > priyanshu
>
> --
> Navneet

--
regards,
chinna.

--
You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group.
To post to this group, send email to algogeeks@googlegroups.com.
To unsubscribe from this group, send email to algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/algogeeks?hl=en.
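
A rough sketch of the keyword-counting approach from this message, in Python. The function names and the tokenizer (lowercased runs of letters, digits, and apostrophes) are assumptions made for illustration, and so is the choice to count a keyword as "found" only while its remaining count in the first doc is above zero.

```python
import re
from collections import Counter


def keywords(text):
    # Assumption: a "keyword" is any lowercased run of letters, digits, or apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity_percent(doc_a, doc_b):
    # Step 1: hash all keywords of the first doc and keep a count for each.
    counts = Counter(keywords(doc_a))

    # Step 2: walk the second doc; each keyword still available in the
    # table uses up one occurrence and counts as found.
    words_b = keywords(doc_b)
    found = 0
    for word in words_b:
        if counts[word] > 0:
            counts[word] -= 1
            found += 1

    # Percentage = found keywords / total keywords in the second doc.
    if not words_b:
        return 0.0
    return 100.0 * found / len(words_b)


if __name__ == "__main__":
    print(similarity_percent("the cat sat on the mat", "the cat ate the rat"))
```

Note that this ignores word order, so two docs with the same words shuffled would score 100%. The LCS-based diff mentioned in the quoted reply is the order-sensitive alternative.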