You might try putting the filehandle in binary mode and see if that clears up the problem. To do this, add the line
binmode CAT;
immediately before the first read from the handle, that is, after the handle is opened and before 'while (<CAT>)'. I think that is somewhere in the neighborhood of line 110.
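A rough sketch of where the call goes, in case it helps. This is not the actual pdf2html.pl code: the clean_lines() helper, the plain-file open, and the substitution are placeholders I made up for illustration. Only the CAT handle name and the 'while (<CAT>)' loop come from the script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from pdf2html.pl): reads a file through the
# CAT handle as raw bytes and applies a placeholder substitution.
sub clean_lines {
    my ($path) = @_;

    # Assumption: a plain file is opened here for illustration only.
    open(CAT, '<', $path) or die "Cannot open $path: $!";

    # Put the handle in binary mode after the open and before the first
    # read, so the input is treated as raw bytes rather than UTF-8.
    binmode CAT;

    my @out;
    while (<CAT>) {
        # Placeholder substitution: strip control characters other than
        # tab, newline, and carriage return.
        s/[\x00-\x08\x0b\x0c\x0e-\x1f]//g;
        push @out, $_;
    }
    close CAT;
    return @out;
}
```

The point is only the ordering: open, then binmode, then the read loop. Calling binmode after data has already been read from the handle may not have the intended effect.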
Jim
On Monday, June 30, 2003, at 12:43 PM, Dan Muey wrote:
When indexing PDFs, I occasionally get:
Malformed UTF-8 character (unexpected continuation byte 0xad, with no preceding start byte) in substitution ...
It is always blamed on lines 113 and 117 in pdf2html.pl,
which are in pdf_body().
1930:1758:5:http://www.marcinciso.com/cassens/master/ctc-1448.pdf: size = 9071
I assume the above line means all went well. But then I get this for lots of these PDFs:
1931:1901:5:http://www.marcinciso.com/cassens/master/ctc-1481.pdf: !! Malformed UTF-8 character (unexpected continuation byte 0xad, with no preceding start byte) in substitution (s///) at /home/dmuey/doc2html/pdf2html.pl line 117, <CAT> line 9.
!! Malformed UTF-8 character (unexpected continuation byte 0xad, with no preceding start byte) in substitution (s///) at /home/dmuey/doc2html/pdf2html.pl line 117, <CAT> line 10.
!! Malformed UTF-8 character (unexpected continuation byte 0xad, with no preceding start byte) in substitution (s///) at /home/dmuey/doc2html/pdf2html.pl line 117, <CAT> line 12.
size = 5891
And back to normal:
1932:1599:5:http://www.marcinciso.com/cassens/master/ctc-1014.pdf: size = 8400
So is this just a warning, or does it mean it's not able to index the file very well, or at all?
What could cause it and how could I remedy it?
Thanks
Dan
-------------------------------------------------------
This SF.Net email sponsored by: Free pre-built ASP.NET sites including
Data Reports, E-commerce, Portals, and Forums are available now.
Download today and enter to win an XBOX or Visual Studio .NET.
http://aspnet.click-url.com/go/psa00100006ave/direct;at.asp_061203_01/ 01
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

