What sort of text?

If it's long written text, I'd split it into paragraphs, then loop through each
paragraph and split it into sentences, then loop through each sentence and
split it into words. Count how many of those words also appear in the related
sentence in the original text, and give each sentence a score based on the
number of matching words. Then combine the sentence scores to give paragraph
scores, and the paragraph scores to give an overall score.
Decide on a limit: if the overall score is over that limit, consider it a
minor change; otherwise it's a major one.

(Or something along those lines)
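
Here is a rough sketch of that idea in Python, just to make the steps
concrete. Everything in it is an assumption on my part: the function names,
the blank-line paragraph split, the punctuation-based sentence split, treating
the "related" sentence as the one in the same position, averaging the scores,
and the 0.7 limit. Tune or replace any of it to suit your data.

```python
import re

def split_paragraphs(text):
    # Assumption: paragraphs are separated by one or more blank lines.
    return [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]

def split_sentences(paragraph):
    # Naive split after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def split_words(sentence):
    # Lowercase so that matching ignores case.
    return re.findall(r"[a-z0-9']+", sentence.lower())

def sentence_score(old_sentence, new_sentence):
    # Fraction of the original sentence's words still present in the new one.
    old_words = split_words(old_sentence)
    new_words = set(split_words(new_sentence))
    if not old_words:
        return 0.0 if new_words else 1.0
    matches = sum(1 for word in old_words if word in new_words)
    return matches / len(old_words)

def combined_score(old_parts, new_parts, score_pair):
    # Pair items by position; an item with no counterpart scores 0.
    longest = max(len(old_parts), len(new_parts))
    if longest == 0:
        return 1.0
    paired = zip(old_parts, new_parts)
    return sum(score_pair(old, new) for old, new in paired) / longest

def paragraph_score(old_paragraph, new_paragraph):
    return combined_score(split_sentences(old_paragraph),
                          split_sentences(new_paragraph),
                          sentence_score)

def overall_score(old_text, new_text):
    return combined_score(split_paragraphs(old_text),
                          split_paragraphs(new_text),
                          paragraph_score)

def is_minor_change(old_text, new_text, limit=0.7):
    # The limit is an arbitrary starting point, not a recommended value.
    return overall_score(old_text, new_text) >= limit
```

Pairing by position is the weak spot: if the customer inserts a sentence at the
start, every later sentence is compared against the wrong one and the score
drops sharply. Matching each sentence to its best-scoring counterpart would be
more forgiving, at the cost of more comparisons.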

> > "Similar" is a rather vague description.  Are you looking for overlap?
> > Length?  Number of characters in the same location in each string?
> > Number of similar characters?
> 
> Good point, pardon my lack of clarity.  I'm looking for matching text.  
> It would essentially be identical to a plagiarism-checking algorithm, 
> but I'd be using it for the opposite.
> 
> Ultimately, I'm trying to see if a customer made minor edits to the 
> text they entered, or if they changed the text completely.


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:271696