RE: PDFBox PDFExtractor

Rod.Madden Mon, 12 Sep 2005 09:07:39 -0700

Thanks for the reply, Jeroen. Does anyone have any
experience or comments regarding the use of PDFTextStream
versus PDFExtractor for working with PDF files? The
issue for us is that memory usage appears to be very high
when we work with PDFs using PDFExtractor.


I have heard that PDFTextStream may be a better solution.

Rod

-----Original Message-----
From: Jeroen Reijn [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 12, 2005 11:58 AM
To: [email protected]
Subject: Re: PDFBox PDFExtractor

Hi Rod,

PDFBox is a separate project. The PDFExtractor in Jakarta Slide uses
PDFBox's functionality to extract the information from the .pdf file.

Hope this answers your question.

Jeroen


[EMAIL PROTECTED] wrote:
> Hi,
>
> I am new to Lucene and looking at some existing Lucene code....
>
> I am confused about the relationship ( if any ) between
> org.apache.slide.extractor.PDFExtractor methods and org.PDFBox.cos
> methods for the purposes of working with PDF files.
>
> I have found info on the web regarding PDFBox, however, I have found
> little regarding .PDFExtractor.
>
> I am curious since we are having some issues with indexing PDF files
> and I am wondering if PDFExtractor implements PDFBox or if it is a
> separate utility set.
>
> Rod.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


