On 10/8/15 8:45 AM, Nick Burch wrote:
> Hi All
>
> TL;DR - There's a handful of Java mini-projects, one per file
> format, each with a library and command-line tools, in and around
> Apache Tika. Would Commons be a good Apache home for them?
>
> Apache Tika, for those who don't know, is a toolkit for detecting
> file types, then extracting consistent structured metadata and
> content. It wraps a whole bunch of other Java libraries, and hides
> all the complexity from users.
>
> In a few cases, there hasn't been a suitably licensed / available
> library for a format that Tika wanted to support, so we've ended
> up having to write our own. As part of an experiment, some of
> these are in the Tika codebase, and some are hosted externally. A
> few of them are generally useful, in particular the Ogg and the
> MP3 ones.
>
> For the formats where the support code is in Tika, we're not
> seeing any re-use beyond Tika. The code is embedded in the Tika
> Parsers jar, and no-one would think to look in there for some
> generic file format code. Nor would you really expect to find it
> in Tika anyway, even if it had its own jar. For the Ogg code,
> which we've tried hosting on Github, there has been some re-use of
> the code. There hasn't been all that much visibility though, and
> releasing without the Apache infrastructure can be a bit of a
> pain, plus one single person needs to take charge of the project.
>
> For Ogg, as well as the Java library code, there's the Tika plugin
> code, and command line tools. No audio encoding/decoding yet, but
> much of the work is there if someone wanted to finish it off.
> We're considering adding a SAS7BDAT library to this little
> grouping shortly too, which as well as being used by Apache Tika,
> would also be used by Apache Metamodel, possibly some others too,
> and would have command line tools.
>
> Following some discussions last week at ApacheCon / Apache Big
> Data / ApacheCon BarCamp on this, it was suggested we try asking
> here if you think these could have a good home in Apache Commons?
> On the one hand, they are in Java, and are re-usable. On the
> other, they have command line tool packages as well, which doesn't
> seem that commons-like, ditto the multimedia encoding/decoding
> parts which are nearly there.
>
> What do you all think? Could Commons be a suitable home for them?
> Or should we look elsewhere? (We do have a backup idea if needed)
This depends on what you mean by "home." What does not work is just
parking code here and hoping someone else picks it up. What can work
fine is moving some code here and working on it and building
community around it here. There just needs to be a micro-community
interested and willing to generate interest in the code and maintain
it. If this is the case, then you all are most welcome to join us.

Phil

>
> Thanks
> Nick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
