Alistair Oldfield created PDFBOX-5269:
-----------------------------------------

             Summary: Consider making LegacyPDFStreamEngine a public class
                 Key: PDFBOX-5269
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5269
             Project: PDFBox
          Issue Type: Improvement
          Components: Text extraction
    Affects Versions: 2.0.24
            Reporter: Alistair Oldfield


Please consider making LegacyPDFStreamEngine public.


This will allow extending the class. 

At the moment, anyone who wishes to extend it needs to copy the entire class 
source into a local version and make that copy public. 


This in turn makes it necessary to create a local copy of PDFTextStripper so 
that it can inherit from the local copy of LegacyPDFStreamEngine.


One reason someone would want to extend it (my example): for my needs, I have 
had to change the implementation of:


    public void processPage(PDPage page):    

My changed implementation looks like this (it is particular to my needs, but 
hopefully highlights the usefulness, and why it would potentially be needed):

 
{code:java}
try {
    super.processPage(page);
}
catch (MissingOperandException e) {
    // We need to catch this because it is acceptable; we will deal with this
    // particular error by cleaning the PDF.
    throw new PdfLoadingException(e.getMessage(), e);
}
catch (Exception e) {
    // We ignore all other errors and keep going because we are OK with that for
    // our purposes.
}
{code}
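
For readers without the PDFBox sources at hand, the override pattern above can be sketched in a self-contained form. This is only an illustration under stated assumptions: SketchEngine, MissingOperandSketchException and PdfLoadingSketchException are hypothetical stand-ins (not PDFBox classes), a page is represented by a plain String, and the list of skipped pages is an addition for demonstration that the snippet above does not have.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the "missing operand" error; modeled here as an
// IOException subtype (an assumption of this sketch).
class MissingOperandSketchException extends IOException {
    MissingOperandSketchException(String message) {
        super(message);
    }
}

// Hypothetical application-level exception, playing the role of the
// PdfLoadingException from the report.
class PdfLoadingSketchException extends RuntimeException {
    PdfLoadingSketchException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical stand-in for the engine being extended. A "page" is just a
// String so the sketch runs without any PDF library.
class SketchEngine {
    public void processPage(String page) throws IOException {
        if (page.contains("missing-operand")) {
            throw new MissingOperandSketchException("operator is missing an operand");
        }
        if (page.contains("corrupt")) {
            throw new IOException("unreadable content stream");
        }
        // A real engine would walk the page's content stream here.
    }
}

// The subclass from the report, expressed against the stand-ins: rethrow the
// one error we care about, tolerate everything else.
class TolerantSketchEngine extends SketchEngine {
    // Added for illustration only: remember which pages were skipped.
    final List<String> skipped = new ArrayList<>();

    @Override
    public void processPage(String page) {
        try {
            super.processPage(page);
        } catch (MissingOperandSketchException e) {
            // Surface this case so the caller can clean the PDF and retry.
            throw new PdfLoadingSketchException(e.getMessage(), e);
        } catch (Exception e) {
            // All other errors are tolerated; processing continues.
            skipped.add(page);
        }
    }
}
```

The order of the catch clauses matters: the more specific exception has to come first, otherwise the general catch would swallow it as well.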



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
