I should have thought of this!
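In doc2html.pl, after the lines
$Name = $URL;
$Name =~ s#^.*/##;
add this line, which turns each %XX escape back into the character it stands for: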
$Name =~ s/%([A-F0-9][A-F0-9])/pack("C", hex($1))/gie;
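
To try the change outside doc2html.pl, here is a minimal standalone sketch. The sub name name_from_url is my own invention for the test and is not something in doc2html.pl; the two substitutions are the ones quoted above, and the URL is the one from Marcus's example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of doc2html.pl): build the display name
# from a URL using the same two substitutions as above.
sub name_from_url {
    my ($url) = @_;
    my $name = $url;
    $name =~ s#^.*/##;    # drop everything up to and including the last slash
    # Replace each %XX escape with the single byte whose hex value is XX.
    $name =~ s/%([A-F0-9][A-F0-9])/pack("C", hex($1))/gie;
    return $name;
}

print name_from_url('http://myserver/file%20name.doc'), "\n";    # prints "file name.doc"
```

Note that each escape is decoded to one byte, so a name containing multi-byte UTF-8 escapes comes out as raw bytes rather than decoded characters.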
--
David Adams
Computing Services
Southampton University
----- Original Message -----
From: "Marcus Valentine" <[EMAIL PROTECTED]>
To: "htdig-general" <[EMAIL PROTECTED]>
Sent: Friday, July 06, 2001 3:04 PM
Subject: [htdig] Getting doc2html.pl to strip out %20s
> I'm using htdig on Linux to index an intranet served by a Windows machine.
> Many of the documents have a space in the file names.
>
> When you run doc2html at the command line with something like
>
> doc2html.pl file%20name.doc application/msword http://myserver/file%20name.doc
>
> the resulting html contains
>
> <TITLE>[file%20name.doc]</TITLE>
>
> The title tag is then used by htsearch to display the heading for that hit.
>
> Is it possible to modify doc2html.pl so that it returns
>
> <TITLE>[file name.doc]</TITLE>
>
> instead, for ease of readability of the search results? Similarly other
> characters could be de-webbified.
>
> Marcus Valentine
>
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
>