If you are using version 3.0 of doc2html, then script doc2html.pl calls a
separate script, pdf2html.pl, to do the conversion.  This script in turn
calls pdfinfo and pdftotext from the xpdf package.

1)    Try using both pdfinfo and pdftotext from the command line.
        If they don't work then either you haven't installed them properly
        or there is something wrong with your .PDF file.

2)    Call pdf2html.pl from the command line:

        <full path name>pdf2html.pl  <pathname to .PDF file>

    to check that it is producing HTML output.

3)    If it does not work then check your installation of pdftotext and
        your tailoring of pdf2html.pl.

4)    If it does work then try doc2html.pl from the command line:

        <full path name>doc2html.pl <pathname to .PDF file> application/pdf Name
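
As a rough sketch only, the checks above could be strung together in a
Bourne-type shell (for example bash under cygwin).  The function names
check_tool and diagnose_pdf are my own invention, and the two arguments
(the .PDF file and the directory holding the scripts) are placeholders
you would have to supply; I am also assuming that a failing stage
either exits non-zero or produces no output.

```shell
#!/bin/sh
# Hypothetical helper: report whether a program can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
        return 0
    fi
    echo "missing: $1"
    return 1
}

# Hypothetical helper: run each stage in turn and stop at the first failure.
# $1 = path to a .PDF file, $2 = directory holding pdf2html.pl and doc2html.pl
diagnose_pdf() {
    pdf_file=$1
    script_dir=$2

    # Step 1: the xpdf tools must be installed and able to read the file.
    check_tool pdfinfo || return 1
    check_tool pdftotext || return 1
    pdfinfo "$pdf_file" || return 2
    pdftotext "$pdf_file" - >/dev/null || return 2

    # Step 2: pdf2html.pl on its own should produce some HTML.
    out=$("$script_dir/pdf2html.pl" "$pdf_file") || return 3
    [ -n "$out" ] || return 3

    # Step 4: doc2html.pl with the arguments given in step 4.
    out=$("$script_dir/doc2html.pl" "$pdf_file" application/pdf Name) || return 4
    [ -n "$out" ] || return 4

    echo "all stages produced output"
}
```

Calling diagnose_pdf with your own two paths and then looking at $? tells
you which stage gave up: 1 or 2 points at the xpdf tools or the file, 3 at
pdf2html.pl, and 4 at doc2html.pl.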

Hope that helps.

--
David Adams
Computing Services
Southampton University


----- Original Message -----
From: "Marcus Valentine" <[EMAIL PROTECTED]>
To: "David Adams" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Tuesday, June 19, 2001 11:29 AM
Subject: Re: [htdig] doc2html.pl version 3 problems under windows NT


> At 09:48 19/06/01 +0100, David Adams wrote:
> >Ok, I should not have assumed you were running some form of Unix.
> >
> >If you have success with:
> >
> >set TMPDIR=d//
> >./htdig -i -c ../conf/htdig.conf
> >
> >Try
> >
> >set DOC2HTML_LOG
> >set TMPDIR=d//
> >./htdig -i -c ../conf/htdig.conf
> >
> >which should increase the information you get out of doc2html.pl
> >
> >Did you have to modify doc2html to get it to work under DOS?  If so I would
> >like to know what changes you made so that they can go into the next release.
>
> I didn't realise doc2html.pl hadn't been tested under DOS.  I have got it
> to work at the DOS command line under the following conditions:
>
> 1. Must use cygwin perl (#!e:/cygwin/bin/perl.exe -w).  Activeware perl
> doesn't work as the script doesn't find the input pdf file for some reason.
>
> 2. I hardcoded $TMP = "d:/tmp"; for certainty
>
> 3. I hardcoded $LOG = "d:/conv.log"; and $Verbose = 1;
>
> But still when it's invoked from htdig nothing appears in the log, and the
> pdfs aren't indexed.
>
> There appear to be a number of lines in the script that windows may have
> problems with, e.g.
>
> $RM = "/bin/rm -f";
>
> Marcus Valentine
>
> >If you are using the C-shell then that would become:
> >
> >setenv DOC2HTML_LOG
> >setenv TMPDIR d//
> >./htdig -i -c ../conf/htdig.conf
> >
> >Whereas in Bourne shell and (I think) Bash it would be:
> >
> >DOC2HTML_LOG=""
> >TMPDIR=d//
> >export DOC2HTML_LOG TMPDIR
> >./htdig -i -c ../conf/htdig.conf
> >
> >
> >--
> >David Adams
> >Computing Services
> >Southampton University
> >
> >
> >----- Original Message -----
> >From: "Marcus Valentine" <[EMAIL PROTECTED]>
> >To: "David Adams" <[EMAIL PROTECTED]>;
> ><[EMAIL PROTECTED]>
> >Sent: Tuesday, June 19, 2001 9:36 AM
> >Subject: Re: [htdig] doc2html.pl version 3 problems under windows NT
> >
> >
> >> At 09:10 19/06/01 +0100, David Adams wrote:
> >> >Doc2html.pl is not giving you any error messages, so it seems to be working.
> >> >
> >> >Add
> >> >
> >> >DOC2HTML_LOG=""
> >> >export DOC2HTML_LOG
> >> >
> >> >to the (Bourne shell) script that runs htdig and doc2html.pl will output a
> >> >line for every file indexed,
> >> >which will include the number of bytes extracted and sent back to htdig.
> >>
> >> Sorry - I'm not with you.  Presently I'm running htdig from the dos command
> >> line.  Are you saying I should create a script file and run it from within
> >> cygwin at the bash prompt? I tried
> >>
> >> DOC2HTML_LOG = ""
> >> ./htdig -i -c ../conf/htdig.conf
> >> export DOC2HTML_LOG
> >>
> >> but the "can't open file /tmp/htdext.???" problem recurs, which I fixed by
> >> running htdig from the dos command line and setting TMPDIR=d//
> >>
> >> Thanks
> >>
> >> >--
> >> >David Adams
> >> >Computing Services
> >> >Southampton University
>
>


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
