Hi,

I'm a developer who has used Apache Tika in a Research Data Repository System at The Australian National University. As part of the project's requirements, we extended Apache Tika with a parser that extracts the headers of files in the FITS format (http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=657) using the nom.tam.fits library, available at http://heasarc.gsfc.nasa.gov/docs/heasarc/fits/java/v1.0/ .

Apache Tika can already identify FITS files (without parsing them), as per https://issues.apache.org/jira/browse/TIKA-874 . Would your team be willing to review and potentially incorporate the parser into Tika? The parser in its current form is available at https://github.com/anu-doi/anudc/blob/master/DcShared/src/main/java/au/edu/anu/dcbag/metadata/FitsParser.java .

Thank you,
Rahul Khanna
[email protected]
