Mac files?

Allison, Timothy B. Wed, 23 Jul 2014 11:22:29 -0700

All,

  Over on Tika, it looks like we copied 
org.apache.pdfbox.examples.pdmodel.ExtractEmbeddedFiles to extract embedded 
files.  Looking at the source code for PDComplexFileSpecification, I notice 
that getEmbeddedFile() does not behave like getFilename(); that is, it doesn't 
iterate through the platform-specific variants and return the first non-null 
value.


  When we try to get the PDEmbeddedFile, should we try each of these instead of 
just getEmbeddedFile()?



    getEmbeddedFile()
    getEmbeddedFileDos()
    getEmbeddedFileUnix()
    getEmbeddedFileMac()



  Will getEmbeddedFile() alone potentially miss embedded files?
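
  To make the question concrete, here is a minimal sketch of the fallback I 
have in mind. It is not taken from PDFBox or Tika: the class name and the 
firstNonNull helper are hypothetical, and the order in which the getters are 
tried is my assumption, not necessarily the order getFilename() uses. The 
helper is generic so the sketch compiles without PDFBox on the classpath; the 
comment shows how it might be applied to a PDComplexFileSpecification.

```java
// Hypothetical helper, not part of PDFBox or Tika.
final class EmbeddedFileFallback {

    // Returns the first non-null candidate in argument order,
    // or null if every candidate is null.
    @SafeVarargs
    static <T> T firstNonNull(T... candidates) {
        for (T candidate : candidates) {
            if (candidate != null) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // With PDFBox available, the call might look like this
        // (the order of the getters is an assumption):
        //
        //   PDEmbeddedFile file = firstNonNull(
        //       spec.getEmbeddedFile(),
        //       spec.getEmbeddedFileDos(),
        //       spec.getEmbeddedFileUnix(),
        //       spec.getEmbeddedFileMac());
        //
        // Strings stand in for the embedded files here.
        String found = EmbeddedFileFallback.<String>firstNonNull(
                null, null, "unix", "mac");
        System.out.println(found); // prints "unix"
    }
}
```

  A null result would then mean that none of the four entries is present, 
rather than only the generic one.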



  Thank you.

  Best,

  Tim

extracting embedded documents -- will getEmbeddedFile() alone miss embedded DOS/Unix/Mac files?
