I'm glad that doc2html works OK for you.
In Perl
$WP2HTML = "";
and
$WP2HTML = '';
are equivalent.
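
A minimal sketch of why the choice of quotes makes no difference for an empty string, using only core Perl. The variable names $single_quoted, $double_quoted and $name are purely illustrative and do not appear in doc2html:

```perl
use strict;
use warnings;

# Both statements assign the same zero-length string.
my $single_quoted = '';   # single quotes: no variable interpolation
my $double_quoted = "";   # double quotes: interpolation, but nothing to interpolate

print "equivalent\n"
    if $single_quoted eq $double_quoted && length($single_quoted) == 0;

# The quoting style only matters once the string contains a variable
# or an escape sequence:
my $name = 'doc2html';
print 'parser: $name', "\n";   # prints the literal text: parser: $name
print "parser: $name\n";       # prints: parser: doc2html
```

So either form tells the script that the utility is not installed.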
On Thu, 26 Oct 2000 9:23:37 +0200 [EMAIL PROTECTED] wrote:
> Thanks for your help!
> Your tool works perfectly, especially with German Umlaute. The description in the
>Details-File was very helpful, so it was no problem for someone with no experience
>of Perl to use doc2html.
> But I have one small remark about the Details-File. In the installation instructions
>you write: If you don't have a particular utility then set its location as a null
>string. For example:
> $WP2HTML = '';
>
> I'm not sure, but don't you mean $WP2HTML = ""; instead?
>
>
> Christian Huhn
>
> >>> <[EMAIL PROTECTED]> 25.10.2000 15.41 Uhr >>>
> > > Hi,
> > > I want to index PDF-Files with German Umlaute (ä, ö, ü, ß). Some tests had shown
>me that htdig (v. 3.1.5) and xpdf (v. 0.91) work pretty well with German
>Umlaute, but the external parser parse_doc.pl has problems with them. It splits words
>with Umlaute into two words without the Umlaut.
> > For example:
> > w beim 41 0
> > w diesj 45 0
> > w hrigen 50 0
> > w den 58 0
> > w Platz 62 0
> > > In this case the German word "diesjährigen" is split into "diesj" and "hrigen", and
>I can find both with htsearch.
> > > Does anyone know how to solve this problem for example with a modified version
>of parse_doc.pl?
> > > Thanks,
> > > Christian Huhn
> >
>
> You could try the doc2html parser. I think that the latest version,
> available from the Ht://Dig web site, will not split words this way, but
> I have not tested it thoroughly.
>
> If doc2html does not parse your .PDF files properly, then email an
> example to me personally, and I'll make sure that the next version of
> doc2html works correctly.
>
> -- David J Adams
> <[EMAIL PROTECTED]>
> Computing Services
> University of Southampton
>
>
>
----------------------
David Adams
[EMAIL PROTECTED]