On Mar 24, 2004, at 4:31 AM, juman wrote:

I have two strings that I want to compare using some kind of fuzzy matching.
Is there a good way to do that in Perl, or could someone help with a
routine that matches word by word and gives a percentage result?

Like

String 1 : This is a ten characters long string is it not
String 2 : This is not so long

String 1 compared to String 2 gives 40% (four words are the same)
String 2 compared to String 1 gives 80% (four words are the same)

See if this gives you some ideas:


#!/usr/bin/perl

use strict;
use warnings;

my $string1 = 'This is a ten characters long string is it not';
my $string2 = 'This is not so long';

print compare_words($string1, $string2), "%\n";
print compare_words($string2, $string1), "%\n";

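sub compare_words {
        my($str1, $str2) = @_;
        
        # Exact-match lookup of the words in the second string. A hash avoids
        # interpolating words into a regex, where metacharacters would misbehave.
        my %in_str2 = map { $_ => 1 } split ' ', $str2;
        
        # Distinct words of the first string, in order of first appearance.
        my %seen;
        my @unique = grep { !$seen{$_}++ } split ' ', $str1;
        return 0 unless @unique;    # avoid dividing by zero on an empty string
        
        my $in_both_count = grep { $in_str2{$_} } @unique;
        
        # Percentage of the first string's distinct words found in the second;
        # for the sample strings this prints 40% and then 80%.
        return sprintf '%.0f', $in_both_count / scalar(@unique) * 100;
}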

__END__
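
If "fuzzy" should also mean ignoring case and punctuation, one possible variation is sketched below. The names unique_words and compare_words_normalized are made up for this example and are not part of the script above, and the choice to treat every run of non-word characters as a separator is an assumption that may not suit all input (it splits "don't" into "don" and "t", for instance).

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: lowercase a string and return its distinct words,
# treating any run of non-word characters as a separator.
sub unique_words {
        my ($str) = @_;
        my %seen;
        # The length() test drops the empty leading field that split
        # produces when the string starts with a separator.
        return grep { length($_) && !$seen{$_}++ } split /\W+/, lc $str;
}

# Percentage of the distinct words in $str1 that also occur in $str2,
# compared case-insensitively and without punctuation.
sub compare_words_normalized {
        my ($str1, $str2) = @_;

        my @words1 = unique_words($str1);
        return 0 unless @words1;    # nothing to compare

        my %in_str2 = map { $_ => 1 } unique_words($str2);
        my $in_both_count = grep { $in_str2{$_} } @words1;

        return sprintf '%.0f', $in_both_count / scalar(@words1) * 100;
}

# Four of the five distinct words (this, is, not, long) occur in the second string.
print compare_words_normalized('This is not so long.',
                               'this IS a long string, is it not?'), "%\n";   # prints 80%
```

Whether normalizing like this is appropriate depends on the data; for strings where case or punctuation is significant, the exact-match version is probably the safer choice.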

James

