> Finally, a while back, somebody on this list mentioned quiet a
> different approach: simply read the raw binary document and go fishing
> for what looks like text. I would like to try that :)

I have tried that approach and it works ok. You end up with a bunch of junk
in with the useful stuff. It can clutter up your index and make searching
slower. There are a lot of file formats that don't store all of the text as
sequential text so it won't work. PDF is one, I know that PowerPoint is
another.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to