Hello,
I ran into an infinite loop using doc2html. When it parses a PDF document, it tries to
reassemble hyphenated words. Unfortunately, I have documents that end with a dash,
like "text-", so the loop spins forever looking for the other half of the word. At end
of file <CAT> returns undef, but ($_ .= <CAT>) still evaluates to the non-empty $_, so
the "|| last" never fires. Adding a check for eof fixed it.
In sub try_text(), the original code:

    while (<CAT>) {
        while ( m/[A-Za-z\300-\377]-\s*$/ && $set->{'hyph'}) {
            ($_ .= <CAT>) || last;
            s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
        }

With the fix:

    while (<CAT>) {
        while ( m/[A-Za-z\300-\377]-\s*$/ && $set->{'hyph'}) {
            ($_ .= <CAT>) || last;
            s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
+           last if eof;
        }

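
For anyone who wants to reproduce this outside doc2html, here is a minimal standalone sketch of the same joining loop. It is not the doc2html code itself: the helper name join_hyphenated is made up for illustration, an in-memory filehandle stands in for CAT, the $set->{'hyph'} switch is left out, and the eof check is placed before the read (rather than after the substitution, as in the patch) so that nothing is appended once the input is exhausted.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of doc2html): joins words hyphenated
# across line breaks, using the same regexes as the loop in try_text().
sub join_hyphenated {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        # Keep pulling in the next line while this one ends in "letter-dash".
        while ($line =~ /[A-Za-z\300-\377]-\s*$/) {
            # Without this check the loop never ends on a trailing dash:
            # at EOF the read returns undef, $line stays unchanged, and
            # the pattern above keeps matching.
            last if eof($fh);
            $line .= <$fh>;
            $line =~ s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
        }
        push @lines, $line;
    }
    return @lines;
}

# In-memory filehandle standing in for CAT; note the input ends with a dash.
my $text = "hyphen-\nated words\nend of text-\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
my @out = join_hyphenated($fh);
close($fh);
print @out;
```

With the eof check in place this should print "hyphenated words" followed by "end of text-" and then exit; removing the check makes it hang on the last line, which is the behaviour I saw.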
Terry Luedtke
National Library of Medicine