Hi Ben;
Actually I think I did update PDFBox. I will put it back to the version I
previously had.
Luke
----- Original Message -----
From: Ben Litchfield [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Thursday, December 02, 2004 8:20 PM
Subject: Re: PDF Indexing Error
Hello All;
Perhaps this should be on the PDFBox forum but I was curious if anyone has
seen this error parsing PDF documents using packages other than PDFBox.
/usr/tomcat/fb_hub/GM/Administration/Document/java/java_io.pdf
java.io.IOException: You do not have permission to extract text
The weird
Hi,
I have a PDF parser which uses the PDFBox library to parse PDF documents into
plain text.
I have tried this parser by sending the output directly to the
command line and it works: I
get the plain text, just as I do with my HTMLParser.
But there is a problem with the indexing, I think:
I can
Siegfried,
My guess is that the '.' is accidental and there is nothing special
about '.'. I've used PDFBox+Lucene and it worked well. Are you aware
of Lucene-specific classes included in the PDFBox distribution? You
may be able to use those classes, or you could at least look at their
source,
Hi
I have written a file for PDF indexing. It reads as follows.
This is my IndexPDF file.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import
You need to add log4j to your classpath:
http://logging.apache.org/log4j/docs/
sv
On 24 Aug 2004, sivalingam T wrote:
Hi
I have written a file for PDF indexing. It reads as follows.
This is my IndexPDF file.
import org.apache.lucene.analysis.standard.StandardAnalyzer
Don Vaillancourt wrote:
I used the following code example, from an article linked from
Jakarta's site, to index PDF files:
doc.add(Field.Text(content, new FileReader(f)));
But I realized today that this method only indexes the PDF as is. For
those wondering if the PDF were actually
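
A minimal sketch of why handing the raw file to a FileReader tends to miss the text, under stated assumptions: PDF page content is commonly stored in Flate (zlib) compressed streams, so the words on the page usually do not appear as plain characters in the file. The class RawVsExtracted and its helper methods are hypothetical names made up for this illustration. It uses only java.util.zip and is not a PDF parser; a real indexer would run a text extractor such as PDFBox first and index its output.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration class, not part of Lucene or PDFBox.
public class RawVsExtracted {

    // Compress text with zlib, the way a PDF writer commonly stores a content stream.
    static byte[] deflate(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.ISO_8859_1));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress a zlib stream back to text; this stands in for the extraction step.
    static String inflate(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input: stop rather than loop forever
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // What a character reader over the raw bytes would hand to the indexer.
    static String rawView(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Repeated text so that the compressor is sure to encode it rather than store it.
        String page = "BT (Lucene indexes text) Tj ET BT (Lucene indexes text) Tj ET";
        byte[] stream = deflate(page);
        System.out.println("raw view contains word:  " + rawView(stream).contains("Lucene"));
        System.out.println("extracted contains word: " + inflate(stream).contains("Lucene"));
    }
}
```

Indexing the raw view would give the analyzer compressed bytes instead of words, which is consistent with the behaviour described above. Extracting the text first and indexing that output avoids the problem.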