Hi,

I recently embarked on an htDig installation adventure and am
currently stymied by differences in behaviour between two installs
on mostly similar servers. I have everything up and running on a
Red Hat 7.2 machine (Perl 5.6), and also did an install on a Red Hat 8
machine, which comes with Perl 5.8. I'm being explicit about these Perl
versions because a) I tried to do my homework and believe both machines
have been set up similarly, which should rule out simple installation
errors, b) command-line invocations of the individual tools (antiword,
xlhtml, ppthtml and pdftotext) work flawlessly on both machines, while
c) invoking either doc2html or pdf2html doesn't work correctly on the
RH8 machine with the newer version of Perl.

I'm not in a position to swap versions of Perl on the RH8 machine, but
am seeking confirmation that the new Perl version could actually be
the cause of my installation nightmare.

One example of strange behaviour:

./doc2html.pl "/var/www/htdigtest/xxe_installatie.pdf" "application/pdf" url

results in raw PDF content being injected into the HTML:

<HTML>
<HEAD>
<TITLE>[url]</TITLE>
</HEAD>
<BODY>
<PRE>
%PDF-1.3
%????
4 0 obj
&lt;&lt; /Type /Info
/Producer (FOP 0.20.4) &gt;&gt;
endobj
5 0 obj
&lt;&lt; /Length 1830 /Filter [ /ASCII85Decode /FlateDecode ]
....


whereas

./pdf2html.pl "/var/www/htdigtest/xxe_installatie.pdf" "application/pdf" url

works somewhat better, although it still injects error messages into the
HTML output:

<HTML>
<HEAD>
<TITLE>[url]</TITLE>
</HEAD>
<BODY>
Client-installatie XXE
<p>
by 
<p>
1. Download
<p>
Warning:
<br>
het te downloaden bestand hangt af van uw besturingssysteem. Voor 
Windows-gebaseerde systemen, download het ZIP
<br>
bestand, voor Linux: download het TAR.GZ bestand
Malformed UTF-8 character (unexpected continuation byte 0xb7, with no preceding start byte) in substitution (s///) at ./pdf2html.pl line 117, <CAT> line 15.
Malformed UTF-8 character (unexpected continuation byte 0xb7, with no preceding start byte) in substitution (s///) at ./pdf2html.pl line 117, <CAT> line 15.
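
One possible angle on the warning above, offered as a guess rather than a
diagnosis: Perl 5.8.0 takes the process locale into account when reading
input, and Red Hat 8 uses a UTF-8 locale by default, so a lone 0xb7 byte
(a middle dot in ISO-8859-1) coming out of the converter would be rejected
as invalid UTF-8. The sketch below only reports which locale is in effect.
The names effective_ctype and names_utf8 are made up for this illustration
and are not part of htDig.

```shell
#!/bin/sh
# Effective character-set locale: LC_ALL overrides LC_CTYPE,
# which overrides LANG. Falls back to "C" if none is set.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

# Succeeds if the given locale name carries a UTF-8 codeset suffix
# (hypothetical helper, matches "UTF-8" or "UTF8" in any case).
names_utf8() {
    case "$1" in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

if names_utf8 "$(effective_ctype)"; then
    echo "UTF-8 locale active: $(effective_ctype)"
else
    echo "non-UTF-8 locale active: $(effective_ctype)"
fi
```

If a UTF-8 locale is reported, rerunning the failing command with LC_ALL=C
placed in front of it would show whether the locale makes any difference.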

The other utilities, however, simply don't get invoked by doc2html,
resulting in UNABLE TO CONVERT error messages.
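
Since the helper tools run fine by hand, one inexpensive check is whether
they are also reachable from the environment in which doc2html.pl runs.
A minimal sketch, assuming the tools are looked up by name on PATH (if
doc2html.pl is configured with absolute paths instead, those paths would
need checking rather than PATH). missing_tools is a hypothetical helper
name.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names taken from the message above.
missing_tools antiword xlhtml ppthtml pdftotext
```

An empty result would mean all four tools resolve, which would point the
search back at the Perl scripts themselves.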

I'm utterly clueless w.r.t. Perl, so any guidance would be very
welcome.

Cheers,

</Steven>
-- 
Steven Noels                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at            http://blogs.cocoondev.org/stevenn/
stevenn at outerthought.org                stevenn at apache.org



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
