Bart Lateur wrote:

> The problem is this: given two versions of a word, one without, and one
> with hyphenation, determine which hyphens are soft hyphens (optional
> breakpoints), and which ones are hard hyphens. For example:
> 
>       hypo-allergeen  hy-po-al-ler-geen
> 
> (If you wonder about the hyphenation rules: Dutch)
> 
> Here, all hyphens are soft hyphens, except the one between the "o" and
> the "a", which is required (I guess. I'm not 100% sure about the spelling;
> people seem to disagree on that one), or at least, let's suppose so.
> 
> So, a short and sweet snippet that figures this out, please? The result
> may be whatever form you like.

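#!/usr/bin/perl -w
use strict;

# Usage: hyphen WORD-WITH-HARD-HYPHENS-ONLY FULLY-HYPHENATED-WORD
my ($hard,
    $soft) =     @ARGV    ;
my  $show  = "   $soft   ";   # padded so each hyphen gets 3 chars of context

# Consume both words one character at a time; $i is the offset into $soft.
for (my $i = 0; length $soft; $i++) {
  # A hyphen in $soft with no hyphen in $hard at this point is a soft hyphen.
  if   ( $hard !~ m/^-/
  and    $soft =~ m/^-/) { print substr($show, $i, 7), "  $i\n" }
  else { $hard =~ s/^.// }   # same character in both: consume it from $hard
         $soft =~ s/^.//     # always advance $soft
}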

__END__

[xenon ~/d/Temporary Stuff]% perl hyphen hypo-allergeen hy-po-al-ler-geen
 hy-po-  2
-al-ler  8
ler-gee  12
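
The script above only reports the soft hyphens. If the hard ones are
wanted as well, the same walk can label every hyphen. What follows is
only a rough sketch, not part of the script above: classify_hyphens is a
made-up name, and it assumes the two spellings differ in nothing but the
extra hyphens (it dies otherwise).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: label every hyphen in $soft as 'hard' (also present
# in $hard) or 'soft' (absent from $hard).
# Returns a list of [offset_in_soft, kind] pairs.
sub classify_hyphens {
    my ($hard, $soft) = @_;
    my @hard = split //, $hard;
    my $h    = 0;                    # read position in @hard
    my $pos  = 0;                    # offset into $soft
    my @found;

    for my $ch (split //, $soft) {
        my $hc = $h < @hard ? $hard[$h] : '';
        if ($ch eq '-' and $hc ne '-') {
            push @found, [$pos, 'soft'];      # hyphen only in $soft
        }
        else {
            # Assumption: apart from extra hyphens, both spellings agree.
            die "spellings differ at offset $pos\n" unless $ch eq $hc;
            push @found, [$pos, 'hard'] if $ch eq '-';
            $h++;
        }
        $pos++;
    }
    die "leftover characters in first word\n" unless $h == @hard;
    return @found;
}

printf "%2d  %s\n", @$_
    for classify_hyphens('hypo-allergeen', 'hy-po-al-ler-geen');
```

For the example word this labels offsets 2, 8 and 12 as soft, the same
offsets the script prints, and offset 5 as hard.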

-- 
Kevin Reid
