"Read 8192 from document Read 8192 from document Read 8192 from
document Read 8192 from document Read 8192 from document Read 2048 from
document Read a total of 43008 bytes"

is part of the diagnostic output from htdig itself.  If this is appearing in
the "excerpt" shown by htsearch then you must now have set up htdig and
doc2html.pl in a monumentally weird way beyond my comprehension.

As for the doc2html.pl file, etc. which you emailed earlier, I haven't yet
found any error except that you are using wp2html to convert .RTF files.  I
may be wrong, but I did not think it had that capability.
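
On the question of which files are really Word documents and which are RTF,
the two "od -b" dumps quoted further down can be decoded directly.  A minimal
Python sketch follows; the helper names (od_octal_to_bytes, classify) are made
up for illustration and are not part of htdig or doc2html.pl.  It assumes the
usual OLE2 compound-file signature found at the start of binary .doc files and
the "{\rtf" prefix found at the start of RTF files.

```python
# Assumed signatures: OLE2 compound file (binary Word .doc) and RTF.
OLE2_MAGIC = bytes([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])
RTF_MAGIC = b"{\\rtf"


def od_octal_to_bytes(od_line):
    """Convert one line of `od -b` output (offset, then octal bytes) to bytes."""
    fields = od_line.split()
    # The first field is the offset; the rest are octal byte values.
    return bytes(int(field, 8) for field in fields[1:])


def classify(data):
    """Return 'ole2', 'rtf' or 'unknown' based on the leading bytes."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"


word_dump = "0000000 320 317 021 340 241 261 032 341 000 000 000 000 000 000 000 000"
rtf_dump = "0000000 173 134 162 164 146 061 134 141 156 163 151 134 141 156 163 151"

print(classify(od_octal_to_bytes(word_dump)))
print(classify(od_octal_to_bytes(rtf_dump)))
```

Fed the two dumps from the earlier message, the first classifies as OLE2 (a
binary Word document) and the second as RTF, despite its .doc extension.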

Have you succeeded in running doc2html.pl from the command line?   The
format is:

/export/home/htdig-3.1.6/scripts/doc2html/doc2html.pl \
    /fullpathname/worddocument.doc "application/msword" \
    http://www.wherever/worddocument.doc

where only the third argument is optional, and the second argument must be
exactly "application/msword".
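
If it helps, the same invocation can be scripted so that the converter's
standard output and standard error are captured separately, which makes it
easier to tell converted text from error messages.  This is only a sketch in
Python: build_command and run_converter are hypothetical helper names, and the
document path and URL are the placeholders from the format above.

```python
import subprocess

# Script location as given in the command format above.
DOC2HTML = "/export/home/htdig-3.1.6/scripts/doc2html/doc2html.pl"


def build_command(doc_path, url=None, script=DOC2HTML):
    """Return the argument list: script, file path, MIME type, optional URL."""
    cmd = [script, doc_path, "application/msword"]
    if url is not None:
        cmd.append(url)  # the third argument is optional
    return cmd


def run_converter(doc_path, url=None, script=DOC2HTML):
    """Run the converter and return (exit status, stdout text, stderr text)."""
    result = subprocess.run(
        build_command(doc_path, url, script),
        capture_output=True,   # keep stdout and stderr apart
        text=True,
        errors="replace",      # tolerate non-UTF-8 bytes in the output
    )
    return result.returncode, result.stdout, result.stderr


# Example call (placeholder paths, not run here):
# status, out, err = run_converter("/fullpathname/worddocument.doc",
#                                  "http://www.wherever/worddocument.doc")
```

An empty standard output together with messages on standard error would point
at the conversion step rather than at htdig.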

--
David Adams
Computing Services
Southampton University

----- Original Message -----
From: "Wendt, Trevor" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: "'Gilles Detillieux'" <[EMAIL PROTECTED]>
Sent: Monday, September 09, 2002 5:17 PM
Subject: RE: [htdig] htdig & wp2html problems


> I'm not entirely sure what changed but now it is indexing the word
> documents (sort of) with no errors.  If I do a search for * all the
> information in the database is returned.  I can see the word document
> URLs and what should be excerpts for each.
>
> For each document referenced it gives "No Page Title Found" and something
> like this: "Read 8192 from document Read 8192 from document Read 8192 from
> document Read 8192 from document Read 8192 from document Read 2048 from
> document Read a total of 43008 bytes" for the excerpt. When I do a search
> for a specific word found within one of the word documents, my search
> returns no results. It looks like it is finding the file and noting the
> file size and what not but it is not parsing the document.
>
> Anyone have suggestions on this problem?
>
> -Trevor
>
>
>
> -----Original Message-----
> From: Gilles Detillieux [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 06, 2002 3:03 PM
> To: Wendt, Trevor
> Cc: [EMAIL PROTECTED]
> Subject: Re: [htdig] htdig & wp2html problems
>
>
> According to Wendt, Trevor:
> > Word Doc:
> > $ od -b /export/home/htdig-3.1.6/scripts/doc2html/IntranetROI.doc | head -1
> > 0000000 320 317 021 340 241 261 032 341 000 000 000 000 000 000 000 000
> >
> > Looks like the magic numbers match when it's on the local box (which is
> > solaris) but the file itself is located on an NT/IIS 4.0 box. I didn't
> > think that would cause a problem but for kicks I downloaded hod, a nice
> > little octal dump program for windows, and the dump output matches on NT
> > as well.
> >
> > Since the Word RTF is working, here's the od output from it.
> > RTF Doc:
> > $ od -b /export/home/htdig-3.1.6/scripts/doc2html/IntranetROI_wo*.doc | head -1
> > 0000000 173 134 162 164 146 061 134 141 156 163 151 134 141 156 163 151
> >
> > As of now, I have not modified anything in my doc2html.pl file since my
> > last email.
> >
> > Any other ideas? I do appreciate all the help!
>
> I'm afraid I'm stumped.  Hopefully David or someone else more familiar
> with doc2html than me can think of something I've missed.
>
> Do you get exactly the same error, and no other potentially useful error
> messages, when you run doc2html.pl manually from the command line on one
> of these Word documents?
>
> --
> Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
> Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
>
>
> -------------------------------------------------------
> This sf.net email is sponsored by: OSDN - Tired of that same old
> cell phone?  Get a new here for FREE!
> https://www.inphonic.com/r.asp?r=sourceforge1&refcode1=vs3390
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
> subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
>



