no, no,
these are "plain" pdf files, mostly converted from winword, so there is a
lot of text. when i run pdf2text or pdftohtml and look at the result, i get
all the words/text from the pdf file. so something different is happening here...

regards

Markus Rietzler
* <rietzler_software/>
* RZF NRW
* Tel: 0211.4572-130



-----Original Message-----
From: Gregory Kozlovsky [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 11 September 2002 10:07
To: '[EMAIL PROTECTED]'
Subject: RE: [aseek-users] external converters, pdf files

Sometimes, what appears to be text in .pdf files is actually scanned images
that cannot be indexed. Check for it.
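
One way to run that check is sketched below, assuming the pdftotext command-line tool is installed and on the PATH. The helper names and the 20-word threshold are illustrative assumptions, not part of ASPseek: the idea is simply to extract the text and see whether any real words come out.

```python
import re
import subprocess


def count_words(text: str) -> int:
    """Count runs of two or more letters, a rough proxy for indexable words."""
    return len(re.findall(r"[^\W\d_]{2,}", text))


def has_indexable_text(text: str, min_words: int = 20) -> bool:
    """Return True if the extracted text contains at least min_words words.

    min_words is an arbitrary threshold chosen for this sketch.
    """
    return count_words(text) >= min_words


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path; the "-" argument sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

If has_indexable_text(extract_text("some.pdf")) comes back False, the file is probably made of scanned images ("some.pdf" is a placeholder name). If it comes back True, the text is there and the problem is more likely elsewhere in the indexing setup.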

    Gregory Kozlovsky

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 11 September 2002 09:59
To: [EMAIL PROTECTED]
Subject: [aseek-users] external converters, pdf files


hi,
i am trying to set up aspseek with external converter support. i installed
pdftohtml, indexing works fine, and the pdf files seem to be processed: i can
find the urls of the pdf files in the urlword table, even with status code 200.
but when i search for words from the pdf files i get no results; the pdf files
are not listed in the results...

any idea?

thanks

regards

Markus Rietzler
* <rietzler_software/>
* RZF NRW
* Tel: 0211.4572-130



-----Original Message-----
From: Charlie Farinella [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 10 September 2002 23:35
To: [EMAIL PROTECTED]
Subject: [aseek-users] selective removal of urls

Is there a way to selectively remove a url from our database after it
has been indexed?  We would like to remove porn sites from a
family-friendly database.

-- 
------------------------------------------------------------------------
Charlie Farinella, Appropriate Solutions, Inc.
[EMAIL PROTECTED]
603-924-6079
