I don't know the answer to this, but someone on the htdig mailing list probably does.
 
David Adams
 
----- Original Message -----
Sent: Tuesday, May 18, 2004 12:17 PM
Subject: PDF indexing problem.

Dear Mr. Adams,

I am a newbie in the Linux world, but I have managed to familiarize myself with it as much as I can, much as if I were working on Windows.

The problem is that after indexing the website, searches return the raw PDF header as output, like this:

 

%PDF-1.3 %âãÏÓ 134 0 obj << /Linearized 1 /O 136 /H [ 1629 900 ] /L 1773217 /E 744863 /N 8 /T 1770418 >> endobj xref 134 60 0000000016 00000 n 0000001551 00000 n 0000002529 00000 n 0000002703 00000 n 0000003149 00000 n 0000003205 00000 n 0000003551 00000 n 0000004093 00000 n 0000004145 00000 n 0000004197

 

And I got these errors after running rundig (13 lines):

 

!!      Malformed UTF-8 character (overflow at 0x860e94e4, byte 0x9d, after start byte 0xbf) in substitution (s///) at /opt/htdig/scripts/doc2html.pl line 503, <FILE> line 4.

 

I am using:

1) HT://DIG 3.2.0-16 with the web ‘plug-in’

2) Red Hat Linux 8.0

3) doc2html, which calls pdf2html

 

I searched the Internet for a solution, but I couldn’t find one!

Could you please help me solve this problem?

 

Thanking you,

Regards,

Mohammed Ahmed.

 

 
