I'm glad that doc2html works OK for you.

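In Perl,
        $WP2HTML = "";
and
        $WP2HTML = '';
are equivalent: both assign the empty string. The two quoting styles differ
only in that double quotes interpolate variables and escape sequences, and
here there is nothing to interpolate.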
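
If you want to convince yourself, here is a minimal sketch you can run with
any Perl 5. It is not part of doc2html; the names $single, $double and $name
are made up purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical variables, for illustration only (not taken from doc2html).
my $single = '';
my $double = "";

# Both hold the empty string: they compare equal and have length zero.
print "equal\n" if $single eq $double;
print "lengths: ", length($single), " and ", length($double), "\n";

# An empty string is false in a boolean test, whichever quotes produced it.
print "empty string is false\n" unless $double;

# The two quoting styles only differ once there is something to interpolate.
my $name = 'wp2html';
print 'single: $name', "\n";   # single quotes: prints the literal text $name
print "double: $name\n";       # double quotes: prints the value of $name
```

So for an empty setting either form will do; single quotes are simply the
habit for strings that need no interpolation.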

On Thu, 26 Oct 2000 9:23:37 +0200 [EMAIL PROTECTED] wrote:

> Thanks for your help!
> Your tool works perfectly, especially with German umlauts. The description in the
> Details file was very helpful, so it was no problem for someone with no Perl
> experience to use doc2html.
> But I have one small remark on the Details file. In the installation description
> you write: If you don't have a particular utility then set its location as a null
> string.  For example:
> $WP2HTML = '';
> 
> I'm not sure, but I think you mean $WP2HTML = "";  don't you?
> 
> 
> Christian Huhn
> 
> >>> <[EMAIL PROTECTED]> 25.10.2000  15.41 Uhr >>>
> > Hi,
> >
> > I want to index PDF files containing German umlauts (ä, ö, ü, ß). Some tests
> > showed me that htdig (v. 3.1.5) and xpdf (v. 0.91) work pretty well with German
> > umlauts, but the external parser parse_doc.pl has problems with them. It splits
> > words containing an umlaut into two words, dropping the umlaut.
> > For example:
> >
> > w       beim    41      0
> > w       diesj   45      0
> > w       hrigen  50      0
> > w       den     58      0
> > w       Platz   62      0
> >
> > In this case the German word "diesjährigen" is split into "diesj" and "hrigen",
> > and I can find both with htsearch.
> >
> > Does anyone know how to solve this problem, for example with a modified version
> > of parse_doc.pl?
> >
> > Thanks,
> > Christian Huhn
> >
> 
> You could try the doc2html parser.  I think that the latest version,
> available from the Ht://Dig web site, will not split words this way, but
> I have not tested it thoroughly.
> 
> If doc2html does not parse your .PDF files properly, then email an
> example to me personally, and I'll make sure that the next version of
> doc2html works correctly.
> 
> --  David J Adams
> <[EMAIL PROTECTED]>
> Computing Services
> University of Southampton
> 
> 
> 

----------------------
David Adams
[EMAIL PROTECTED]


------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  <http://www.htdig.org/mail/menu.html>
FAQ:            <http://www.htdig.org/FAQ.html>
