You do not say whether you tested doc2html.pl. Execute it from the command
line with a *.doc file:
/opt/www/htdig/doc2html.pl wordfile.doc application/msword
and similarly for a *.pdf file.
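
If you want to repeat that check for several files, a small wrapper like
the one below may help. This is only a sketch: the function name
check_parser and its messages are made up for illustration and are not
part of ht://Dig or doc2html.pl. It assumes the script takes the file name
and the MIME type as its two arguments, as in the command above, and it
uses backticks so that it should also run under an older Bourne shell.

```shell
#!/bin/sh
# Sketch only: run an external parser by hand and report whether it
# produced any output. check_parser is a hypothetical helper name.

# Usage: check_parser PARSER FILE MIME_TYPE
check_parser() {
    parser=$1
    file=$2
    mime=$3

    # The parser script itself must exist and be executable.
    if [ ! -x "$parser" ]; then
        echo "FAIL: $parser is missing or not executable" >&2
        return 1
    fi

    # The test document must be readable.
    if [ ! -r "$file" ]; then
        echo "FAIL: cannot read $file" >&2
        return 1
    fi

    # Capture stdout only, so the parser's own error messages stay visible.
    out=`"$parser" "$file" "$mime"` || {
        echo "FAIL: $parser exited with an error for $file" >&2
        return 1
    }

    # An empty result means there would be no text to index.
    if [ -z "$out" ]; then
        echo "FAIL: $parser produced no output for $file" >&2
        return 1
    fi

    echo "OK: $parser produced output for $file"
}

# Example calls (manual.pdf is a placeholder file name):
# check_parser /opt/www/htdig/doc2html.pl wordfile.doc application/msword
# check_parser /opt/www/htdig/doc2html.pl manual.pdf application/pdf
```

If the script prints nothing or reports an error here, htdig is unlikely
to get any text to index for that file type either.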
David Adams
Corporate Information Services
Information Systems Services
University of Southampton
----- Original Message -----
From: "Julius Lienemann" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, October 16, 2003 10:34 AM
Subject: [htdig] Indexing Word- and PDF-Files
> Hello all together!
>
> At the moment I'm trying to set up a documentation server for my
> exam project. I've configured htdig so that it can index regular HTML
> sites, but a documentation server must at least index *.pdf and *.doc
> files. I'm a bit upset at the moment, because I have tried a great deal.
> I'm working on a Solaris 8 machine with htdig 3.1.6 and Perl
> 5.1.something. In my htdig.conf I configured the points as listed in the
> doc2html-DETAILS file:
>
> external_parsers application/rtf->text/html /opt/www/htdig/doc2html.pl \
> (...and the other lines for other applications)
>
> In the doc2html.pl I wrote the path to my Perl:
>
> /usr/bin/perl
> use strict;
>
> I installed catdoc and wrote my $CATDOC = '/usr/local/bin' in the
> doc2html.pl file. I also made the configuration for *.pdf files and
> changed the pdf2html.pl file as described in the DETAILS. I've installed
> xpdf in /usr/local/bin.
>
> When I want to index my files I run htdig. I cannot see any error; it
> just reads some files from my Apache server (2.0.47). Then I run htmerge
> to build the databases and index my files. It starts with sorting...,
> then there are some lines like "htmerge: Removing doc #10", it starts
> merging, and after that come some lines with "htmerge: discarding dokus
> (or other directory names) in doc #4". Then it prints the URL of my
> server and in the next line: "Deleted, no excerpt:" with a link to a
> file following.
>
> It seems that htmerge does not index any *.doc or *.pdf files, although
> there are many in these directories. If I search with htsearch in the
> browser or on the command line, htsearch cannot find anything, because
> nothing was indexed... I would be grateful for any help, because it is
> very important for my project that this works!
>
> Best Regards!
>
> Julius Lienemann
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: SF.net Giveback Program.
> SourceForge.net hosts over 70,000 Open Source Projects.
> See the people who have HELPED US provide better services:
> Click here: http://sourceforge.net/supporters.php
> _______________________________________________
> ht://Dig general mailing list: <[EMAIL PROTECTED]>
> ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
> List information (subscribe/unsubscribe, etc.)
> https://lists.sourceforge.net/lists/listinfo/htdig-general
>