Bart Lateur wrote:
> The problem is this: given two versions of a word, one without, and one
> with hyphenation, determine which hyphens are soft hyphens (optional
> breakpoints), and which ones are hard hyphens? For example:
>
>     hypo-allergeen    hy-po-al-ler-geen
>
> (If you wonder about the hyphenation rules: Dutch)
>
> Here, all hyphens are soft hyphens, except the one between the "o" and
> the "a", which is required (I guess. I'm not 100% about the spelling,
> people seem to disagree on that one), or at least, let's suppose so.
>
> So, a short and sweet snippet that figures this out, please? The result
> may be whatever form you like.

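#!/usr/bin/perl -w
use strict;

# $hard: spelling with only the required hyphens
# $soft: fully hyphenated spelling
my ($hard, $soft) = @ARGV;

# Pad with three spaces on each side so that substr($show, $i, 7)
# is a 7-character window centered on the hyphen at offset $i.
my $show = "   $soft   ";

# Consume $soft one character at a time; $i is the offset into the
# original $soft.
for (my $i = 0; length $soft; $i++) {
  if ($hard !~ m/^-/ and $soft =~ m/^-/) {
    # Hyphen only in the hyphenated form: a soft hyphen.
    print substr($show, $i, 7), " $i\n"
  } else {
    # Same character in both (letter or hard hyphen): advance $hard too.
    $hard =~ s/^.//
  }
  $soft =~ s/^.//
}
__END__

[xenon ~/d/Temporary Stuff]% perl hyphen hypo-allergeen hy-po-al-ler-geen
 hy-po- 2
-al-ler 8
ler-gee 12

-- 
Kevin Reid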
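
For comparison, here is a minimal sketch of the same parallel walk that labels every hyphen instead of printing only the soft ones. The sub name classify_hyphens and the [offset, label] result format are made up for illustration and are not part of the script above. It assumes both spellings contain the same letters in the same order and differ only in their hyphens.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): walk both spellings
# in parallel and label each hyphen of the fully hyphenated form.
sub classify_hyphens {
    my ($hard, $soft) = @_;
    my @h = split //, $hard;    # spelling with only the required hyphens
    my @result;                 # list of [offset in $soft, 'hard' or 'soft']
    my $j = 0;                  # read position in @h
    my $i = 0;                  # offset in $soft
    for my $c (split //, $soft) {
        if ($c eq '-') {
            # A hyphen present in both spellings is hard; otherwise soft.
            my $is_hard = $j < @h && $h[$j] eq '-';
            push @result, [$i, $is_hard ? 'hard' : 'soft'];
            $j++ if $is_hard;
        }
        else {
            # Assumption: apart from hyphens, the two spellings have
            # identical letters, so a letter always advances both.
            $j++;
        }
        $i++;
    }
    return @result;
}

print "$_->[0] $_->[1]\n"
    for classify_hyphens('hypo-allergeen', 'hy-po-al-ler-geen');
```

For the example word this should print "2 soft", "5 hard", "8 soft" and "12 soft", one per line. The offsets are positions in the hyphenated spelling, the same numbering the script above uses.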