Hi Chris,

I think your plan to improve the netCDF and HDF parsing is a great one. The richness of a full ncdump of netCDF metadata and a full ncdump of HDF-EOS metadata would be an excellent addition to the 1.0 release of Tika. I have discussed Tika with several science data users, and they usually ask about netCDF and HDF-EOS metadata capabilities. A GDAL parser is also a great idea.
Thanks,
Steve

On Fri, May 20, 2011 at 12:22 PM, Mattmann, Chris A (388J) <[email protected]> wrote:

> Hey Jukka et al.,
>
> > It's a few months since 0.9 and our Tika in Action book is soon ready
> > for print, so I think it's good time to start planning for the 1.0
> > release.
>
> Looking forward to not writing anything for a while :-) I doubt it'll
> happen knowing how things go, but also really really happy with where the
> book is (and banging on those last revisions! :-) ).
>
> >
> > There are a few odds and ends that I'd still like to sort out in the
> > trunk, but overall I think we're in a pretty much ready for the switch
> > from 0.x to 1.x.
>
> +1.
>
> >
> > One major issue to be decided is whether we want to follow up with the
> > earlier intention of dropping deprecated functionality (like the
> > three-argument parse() method) before the 1.0 release.
>
> +1, I'd be fine with this. I'm a fan of following through on things that we
> say we're going to do if for no other good reason than we said we're going
> to do it.
>
> +1 to dropping the 3 arg parse method.
>
> > I think we
> > should do that and also make some other backwards-incompatible
> > cleanups while we're at it. That way we'll have less old baggage to
> > carry as we evolve through the 1.x release cycle.
>
> +1, my biggest thing to work on is improving the NetCDF and HDF parsing,
> adding an ODL parser (I'll create an issue for this), adding some spatial
> parsers (working on the GDAL one right now), and maybe some documentation on
> how to use the science data file formats. I should have time over the next
> month or so to complete these.
>
> >
> > Another thing to think about is whether we want to do a formal Apache
> > press release about Tika reaching 1.0 status.
>
> +1. I'd be happy to work with Jukka, as Nick suggested, to draft this, and
> then from there to work with Sally to make it happen.
>
> Thanks!
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
