[ https://issues.apache.org/jira/browse/TIKA-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856910#action_12856910 ]
Chris A. Mattmann commented on TIKA-400:
----------------------------------------

Hey Jukka:

I think we do, since the NetCDF lib relies on it. I agree with you on accessing internal resources. The problem is that this NetCDF library (which seems to be the most used/maintained from a Java perspective) expects to be responsible for handling the way content is delivered to it too. In fact, NetCDF and HDF concern themselves not only with obtaining data from a particular stream/content, but also with how that content is represented: because the data volumes are so large, they have to optimize how the data is extracted and represented for access.

So, I actually ran into something similar here: the core abstraction for opening up a NetCdfFile in the lib takes only a File as input -- it's really hard to pass it a stream, which is what Tika expects. Arg! Very frustrating indeed. I'll look around and see if there is another ASL-friendly NetCDF Java library (does anyone else know of one?).

Cheers,
Chris

> netCDF Tika Parser
> ------------------
>
>                 Key: TIKA-400
>                 URL: https://issues.apache.org/jira/browse/TIKA-400
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>         Environment: indep. of env.
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 0.8
>
>
> Along with TIKA-399, netCDF is also a widely used scientific data format. I'm
> going to throw up a Tika parser that can deal with netCDF.
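
For reference, one common way around the File-vs-stream mismatch described in the comment is to spool the incoming stream to a temporary file and hand that file to the File-only library. The following is only a minimal sketch using the plain JDK: the class name StreamSpooler, the method spoolToTempFile, and the "tika-netcdf-" prefix are hypothetical, and no NetCDF or Tika call is made. It is not necessarily how the parser for this issue ends up working.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: bridges a stream-based caller to a File-only library.
public class StreamSpooler {

    /** Copies the whole stream into a new temporary file and returns its path. */
    static Path spoolToTempFile(InputStream in) throws IOException {
        // The prefix and suffix are illustrative assumptions.
        Path tmp = Files.createTempFile("tika-netcdf-", ".nc");
        try {
            // createTempFile already created the file, so overwrite it.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; a real caller would pass the stream it was given.
        byte[] payload = "stand-in bytes, not real netCDF".getBytes(StandardCharsets.UTF_8);
        Path tmp = spoolToTempFile(new ByteArrayInputStream(payload));
        try {
            // A File-only library would be handed tmp.toFile() at this point.
            System.out.println("spooled " + Files.size(tmp) + " bytes to a temp file");
        } finally {
            // The caller owns the temporary file and must clean it up.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The trade-off is an extra full copy of the data on disk, which matters given the large data volumes mentioned above, so this would be a workaround rather than a fix for the underlying API mismatch.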