Giovanni Usai created TIKA-1843:
-----------------------------------
Summary: Tika parser for SEG-Y files and new MIME type application/segy
Key: TIKA-1843
URL: https://issues.apache.org/jira/browse/TIKA-1843
Project: Tika
Issue Type: New Feature
Components: mime, parser
Reporter: Giovanni Usai
Priority: Minor
This ticket refers to the parsing of SEG-Y files (extensions .seg, .segy and
.sgy).
The SEG-Y format is used to store seismic data; more information is available
at http://pubs.usgs.gov/of/2001/of01-326/HTML/FILEFORM.HTM.
I have:
- added a new MIME type application/segy matching the file name extensions
.segy, .seg and .sgy.
- created a new SEGYParser, matching that MIME type.
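To illustrate the extension matching described above, here is a minimal plain-Java sketch. It is not the Tika code: in Tika the mapping would normally be declared in the MIME type registry rather than written by hand. The class name SegyMimeSketch and the case-insensitive comparison are assumptions made for this example only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Tika. */
final class SegyMimeSketch {
    static final String SEGY_MIME = "application/segy";

    // The three extensions named in this ticket.
    private static final Set<String> EXTENSIONS =
            new HashSet<>(Arrays.asList(".segy", ".seg", ".sgy"));

    /** Returns true if the file name ends with one of the SEG-Y extensions. */
    static boolean isSegyName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return false; // no extension at all
        }
        // Assumption: extensions are compared case-insensitively.
        String extension = fileName.substring(dot).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(extension);
    }

    /** Returns application/segy for matching names, otherwise null. */
    static String detect(String fileName) {
        return isSegyName(fileName) ? SEGY_MIME : null;
    }
}
```

With this sketch, SegyMimeSketch.detect("survey.sgy") gives application/segy, while a name such as "notes.txt" gives null.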
In order to parse the SEG-Y files, I am using a modified version of the sigrun
code (available under the Apache license at
https://github.com/mikhail-aksenov/sigrun). Notably, I have made a fix and
changed some method signatures so that it can read from a ReadableByteChannel
instead of a FileChannel.
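To show why the switch to ReadableByteChannel matters, here is a minimal sketch that is neither the sigrun code nor the patch itself. A plain channel has no position() or positional read, so headers have to be consumed sequentially and short reads have to be handled in a loop. The class and method names are hypothetical. The sketch assumes the usual SEG-Y layout of a 3200-byte textual header followed by a 400-byte big-endian binary header, with the sample interval at file bytes 3217-3218 and the samples per trace at bytes 3221-3222; both values are treated as unsigned here, which is a simplification.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical helper for illustration; not part of sigrun or Tika. */
final class SegyHeaderSketch {
    static final int TEXT_HEADER_SIZE = 3200;  // textual file header
    static final int BINARY_HEADER_SIZE = 400; // binary file header

    /** Reads exactly size bytes, looping because a channel may return short reads. */
    static ByteBuffer readFully(ReadableByteChannel channel, int size) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
        while (buffer.hasRemaining()) {
            if (channel.read(buffer) < 0) {
                throw new EOFException("Stream ended after " + buffer.position()
                        + " of " + size + " bytes");
            }
        }
        buffer.flip(); // switch the buffer from writing to reading
        return buffer;
    }

    /** Returns {sample interval in microseconds, samples per trace}. */
    static int[] readSampleInfo(ReadableByteChannel channel) throws IOException {
        // No seeking on a plain channel: consume the textual header first.
        readFully(channel, TEXT_HEADER_SIZE);
        ByteBuffer binary = readFully(channel, BINARY_HEADER_SIZE);
        int sampleIntervalMicros = binary.getShort(16) & 0xFFFF; // file bytes 3217-3218
        int samplesPerTrace = binary.getShort(20) & 0xFFFF;      // file bytes 3221-3222
        return new int[] {sampleIntervalMicros, samplesPerTrace};
    }
}
```

Any InputStream can be wrapped with Channels.newChannel(stream) and passed to readSampleInfo, which would not be possible with a FileChannel-only signature.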
For the moment I have put it directly into Tika's new segy package. Is this
the right thing to do, or should I reference it as an external library and
modify the pom.xml accordingly?
Thanks and best regards,
Giovanni