Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "PDFParser (Apache PDFBox)" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/PDFParser%20%28Apache%20PDFBox%29?action=diff&rev1=7&rev2=8

  </properties>
  }}}
  
+ == Optional Dependencies ==
+ If you need to process TIFF or JPEG2000 images within PDFs, consider adding 
the optional dependencies listed by 
[[https://pdfbox.apache.org/2.0/dependencies.html#optional-components|PDFBox]]. 
These dependencies are not compatible with ASL 2.0; make sure that any 
third-party licenses are suitable for your project.
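+ 
+ As a rough sketch, the optional image components could be declared in a Maven 
pom.xml along the following lines. The coordinates are the jai-imageio 
artifacts as we understand the PDFBox page to list them; the 
jai.imageio.version property is a hypothetical placeholder, so check the PDFBox 
page for the current artifacts and versions.
+ {{{
+ <!-- Assumption: set jai.imageio.version yourself; no version is implied here -->
+ <dependency>
+   <groupId>com.github.jai-imageio</groupId>
+   <artifactId>jai-imageio-core</artifactId>
+   <version>${jai.imageio.version}</version>
+ </dependency>
+ <dependency>
+   <groupId>com.github.jai-imageio</groupId>
+   <artifactId>jai-imageio-jpeg2000</artifactId>
+   <version>${jai.imageio.version}</version>
+ </dependency>
+ }}}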
+ 
+ Finally, [[https://twitter.com/mcaruanagalizia/status/796097425446490114|M. 
Caruana Galizia]] alerted us that, if you package your project with 
maven-shade, you need its ServicesResourceTransformer; otherwise the 
third-party dependencies' services files will be overwritten instead of merged. 
For an example, see 
[[https://github.com/ICIJ/extract/blob/master/pom.xml|this pom.xml]].
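+ 
+ A minimal sketch of such a maven-shade configuration is shown below. It is an 
illustration only, not the configuration from the linked project: the plugin 
version is left out and should be pinned in a real build, and the execution 
settings are assumptions.
+ {{{
+ <plugin>
+   <groupId>org.apache.maven.plugins</groupId>
+   <artifactId>maven-shade-plugin</artifactId>
+   <executions>
+     <execution>
+       <phase>package</phase>
+       <goals>
+         <goal>shade</goal>
+       </goals>
+       <configuration>
+         <transformers>
+           <!-- Merges META-INF/services files from all jars into one per service -->
+           <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
+         </transformers>
+       </configuration>
+     </execution>
+   </executions>
+ </plugin>
+ }}}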
  
  == OCR ==
  Note: configuring some of these features via the config file requires a 
nightly build of Tika from after November 8, 2016, or Tika version >= 1.15.
