Thanks, that is exactly what our link looks like. I didn't realize htDig
remembers the text of the referring link and uses it as part of the search
database.
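
For anyone finding this thread later, here is a rough Python sketch of the idea as I understand it. It is not htDig's actual code. The class and function names and the `description_weight` parameter are made up for illustration. It collects the text of each link on a page and credits those words to the link's target, so a document with no extractable text can still match a query.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect (href, anchor text) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse line breaks and repeated spaces in the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def index_link_descriptions(page_html, description_weight=1.0):
    """Map each link target to weighted words taken from its link text.

    description_weight is a hypothetical stand-in for whatever score
    the real indexer gives to link descriptions.
    """
    collector = LinkTextCollector()
    collector.feed(page_html)
    index = defaultdict(lambda: defaultdict(float))
    for href, text in collector.links:
        for word in text.lower().split():
            index[href][word] += description_weight
    return index


def score(index, url, query):
    """OR-style score: sum the weights of each query word known for url."""
    return sum(index[url].get(word, 0.0) for word in query.lower().split())
```

Feeding it a page that contains the link David quoted and scoring the PDF's URL against "what would like" gives a non-zero score, even though nothing was ever read from the PDF itself.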

--
Terry Luedtke
National Library of Medicine

On Thu, Apr 19, 2001 at 09:20:07AM +0100, David Adams wrote:
> The latest versions of doc2html extract the Subject, Title, and Keywords
> from PDF files that have them, but your document doesn't.
> 
> Most likely you give a relatively high score to the "Description", and you
> have indexed a page with a link like:
> 
> <A href="http://profiles.nlm.nih.gov/BB/A/A/A/A/_/bbaaaa.pdf">What I would
> Like</a>
> 
> --
> David Adams
> Computing Services
> Southampton University
> 
> 
> ----- Original Message -----
> From: "Terry Luedtke" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, April 18, 2001 9:49 PM
> Subject: [htdig] How does it find this pdf file?
> 
> 
> >
> > Hello,
> >
> > We have a search that returns a PDF file as the best hit. But the PDF file
> > is an image, not text, so I don't know how htDig is finding it. I have
> > customers who want to know how it does so they can repeat it. We are using
> > doc2html and pdftotext 0.91. When I run the file through doc2html all I get
> > is gibberish. The search is
> >
> >
> > http://wwwindex.nlm.nih.gov/cgi/htsearch?config=www_exact;method=or;format=builtin-long;words=what%20would%20like;page=1
> >
> > and the PDF file is the first link (bbaaaa.pdf).
> >
> > Any explanations on how this file gets indexed? (I'd love to tell them
> > htDig has OCR, but they wouldn't believe it.)
> >
> > Thanks,
> > Terry Luedtke
> > National Library of Medicine
> >
> >
> > _______________________________________________
> > htdig-general mailing list <[EMAIL PROTECTED]>
> > To unsubscribe, send a message to
> > <[EMAIL PROTECTED]> with a subject of unsubscribe
> > FAQ: http://htdig.sourceforge.net/FAQ.html
> >
