RE: Does the Lucene search engine work with PDF's?

2003-10-20 Thread MOYSE Gilles (Cetelem)
You can also use the TextMining.org toolbox, which provides classes to extract text from PDF and DOC files, using the Jakarta POI project. They are all free, under the Apache Licence. The URL: http://www.textmining.org/modules.php?op=modload&name=News&file=article&sid=6&mode=thread&order=0&thold=0

Re: Does the Lucene search engine work with PDF's?

2003-10-17 Thread Ben Litchfield
You need to be able to extract the text from them and feed that to Lucene. http://www.pdfbox.org can extract text from PDF documents.

Ben

On Fri, 17 Oct 2003, Andre Hughes wrote:
Hello, Can the Lucene search engine index and search through PDF documents? What are the file format limits for
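
To illustrate the point made in this thread (the index only ever sees plain text, so extraction is a separate step in front of it), here is a minimal Java sketch. Everything in it is a stand-in invented for illustration: `TextExtractor` is a hypothetical interface where a real setup might wrap PDFBox or the TextMining.org classes, and `TinyIndex` is a toy in-memory inverted index, not Lucene's API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of the "extract text first, then index it" pipeline. */
class PdfIndexSketch {

    /** Hypothetical extractor: turns raw file bytes into plain text. */
    interface TextExtractor {
        String extract(byte[] fileContents);
    }

    /** Toy stand-in for a search index: maps each lowercase term to document ids. */
    static class TinyIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        /** Splits the text on non-word characters and records each term. */
        void add(String docId, String text) {
            for (String term : text.toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
                }
            }
        }

        /** Returns the ids of documents containing the term (case-insensitive). */
        Set<String> search(String term) {
            return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
        }
    }

    /** The index never touches the file format; it only receives extracted text. */
    static void indexFile(TinyIndex index, TextExtractor extractor,
                          String docId, byte[] contents) {
        index.add(docId, extractor.extract(contents));
    }

    public static void main(String[] args) {
        // Fake extractor that treats the bytes as UTF-8 text (assumption for the demo).
        TextExtractor fake = bytes -> new String(bytes, StandardCharsets.UTF_8);
        TinyIndex index = new TinyIndex();
        indexFile(index, fake, "report.pdf",
                  "Quarterly search report".getBytes(StandardCharsets.UTF_8));
        System.out.println(index.search("search"));
    }
}
```

Supporting another file format under this design means supplying another `TextExtractor`; the indexing side stays unchanged.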