[ https://issues.apache.org/jira/browse/TIKA-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting updated TIKA-605:
-------------------------------

    Attachment: 0001-TIKA-605-Tika-GDAL-parser.patch

I guess ideally we should ask the GDAL toolkit to support parsing from just an InputStream. But until that happens, the attached patch implements a simple mechanism by which a parser can provide a default file name suffix for use by TikaInputStream.getFile(). The relevant parser code would be something like this:

{code}
File file = tis.getFile(metadata.get(Metadata.RESOURCE_NAME_KEY));
{code}

or:

{code}
File file = tis.getFile("pattern.pdf");
{code}

> Tika GDAL parser
> ----------------
>
>                 Key: TIKA-605
>                 URL: https://issues.apache.org/jira/browse/TIKA-605
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>         Environment: indep. of env.
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>              Labels: gdal, integration, tika
>             Fix For: 1.0
>
>         Attachments: 0001-TIKA-605-Tika-GDAL-parser.patch, TIKA-605.Mattmann.092511.patch.txt
>
>
> Leverage the GDAL toolkit and its Java SWIG bindings to create a Tika parser around GDAL. See here:
> http://trac.osgeo.org/gdal/browser/trunk/gdal/swig/java/apps/gdalinfo.java
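
For readers who want to see the idea behind the comment above in isolation: a file-based toolkit such as GDAL may pick its driver from the file extension, so a stream spooled to a temporary file should keep the suffix of the original resource name. The following is only a rough sketch of that idea using plain JDK calls. It is not the code in the attached patch and not Tika API; the class `SpoolSketch`, the methods `suffixOf` and `spool`, the `tika-sketch-` prefix, and the `.tmp` fallback are all hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Tika or of the attached patch.
final class SpoolSketch {

    // Returns the extension of the given name including the dot (e.g. ".pdf"),
    // or the fallback when the name is null or has no usable extension.
    static String suffixOf(String name, String fallback) {
        if (name == null) {
            return fallback;
        }
        int slash = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        int dot = name.lastIndexOf('.');
        // The dot must come after the first character of the base name
        // and must not be the last character of the name.
        if (dot > slash + 1 && dot < name.length() - 1) {
            return name.substring(dot);
        }
        return fallback;
    }

    // Copies the stream into a temporary file whose suffix is derived from
    // the resource name, so that a file-based library sees a familiar extension.
    static Path spool(InputStream in, String resourceName) throws IOException {
        Path tmp = Files.createTempFile("tika-sketch-", suffixOf(resourceName, ".tmp"));
        // createTempFile has already created the file, so replace its contents.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        Path p = spool(new ByteArrayInputStream(new byte[] {1, 2, 3}), "pattern.pdf");
        System.out.println(p.getFileName());
        Files.delete(p);
    }
}
```

The caller is responsible for deleting the temporary file once the file-based parser is done with it.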