[htdig] infinite loop in doc2html.pl

Terry Luedtke Tue, 19 Sep 2000 06:56:03 -0700

Hello,

I ran into an infinite loop using doc2html.pl.  When it parses a PDF document it tries to
reassemble hyphenated words.  Unfortunately, I have documents whose last line ends with a
dash, like "text-", so the loop spins forever looking for the other half of the word.  At
end of file <CAT> returns undef, so the append leaves $_ unchanged and still true, and the
"|| last" never fires.  Adding a check for eof fixed it.

In sub try_text(), the original loop:

      while (<CAT>) {
        while ( m/[A-Za-z\300-\377]-\s*$/ && $set->{'hyph'}) {
          ($_ .= <CAT>) || last;
          s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
        }
With the fix (the added line is marked with +):
      while (<CAT>) {
        while ( m/[A-Za-z\300-\377]-\s*$/ && $set->{'hyph'}) {
          ($_ .= <CAT>) || last;
          s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
+          last if eof;
        }
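
The behaviour can be reproduced outside doc2html.pl with a small standalone
script.  This is only a sketch under some assumptions: join_hyphens is a
hypothetical helper name, an in-memory filehandle stands in for CAT, and the
$set->{'hyph'} switch is left out.  It also guards with a defined() test on the
next line instead of "last if eof", which is an alternative way to stop at end
of input, not the patch itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of doc2html.pl): joins words that are
# hyphenated across line breaks, reading from any filehandle.
sub join_hyphens {
    my ($fh) = @_;
    my @out;
    while (defined(my $line = <$fh>)) {
        # Keep appending lines while the current text ends in "letter-".
        while ($line =~ m/[A-Za-z\300-\377]-\s*$/) {
            my $next = <$fh>;
            last unless defined $next;    # end of input: nothing left to join
            $line .= $next;
            $line =~ s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
        }
        push @out, $line;
    }
    return @out;
}

# In-memory "file" whose last line ends with a dash, as in the report.
my $text = "hyphen-\nated words\nlast line ends with text-\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
my @lines = join_hyphens($fh);
close($fh);
print @lines;
```

Run as is, the script should terminate and print "hyphenated words" followed by
"last line ends with text-".  Without the defined() guard (or an eof check) the
inner loop never exits on the final line.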


Terry Luedtke
National Library of Medicine





