wrote:
COM-based parser:
http://www.intrinsyc.com/products/enterprise_applications.asp
Convert Word to text: http://www.winfield.demon.nl/index.html
That's a bit expensive... I found a free alternative - Jawin, plus OLE
Automation.
--
Best regards,
Andrzej Bialecki
You may want to think about using POI from Jakarta:
http://jakarta.apache.org/poi
Clemens
----- Original Message -----
From: Pinky Iyer [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Friday, February 28, 2003 9:44 PM
Subject: Word doc parser
Anybody knows of a good word
Kelvin,
Have you had a chance to check in any of your search subsystem
components? I know it's been a while since I mentioned the issue, but
I'd love to make some headway on a solid Turbine search subsystem for
general consumption.
Thanks,
Seth
-----Original Message-----
From: Kelvin Tan
The first query is always slow because it includes the time taken to load the
index. Index loading time is a function of archive size: the larger the
archive, the longer the load time. Search time, however, is more a function
of the number of search terms, meaning if your archive only contains 100