Adding document format libraries as subprojects of Tika would still
"hide" them somewhat, so it wouldn't really solve the problem of easily
finding such libraries. If new libraries are to be developed, I would
think that a lab or Commons is better suited.

There has been a lot of talk over the years about creating an image
library inside the ASF, but it has never developed into a real effort.
It's a lot of work, and with ImageIO built into the JDK, only exotic
wishes remain open.

If we had a Tika Wiki, we could at least list potentially suitable
existing libraries as well as libraries that we'd like to have but that
don't exist yet. We could list licenses, candidates for incubation,
quality/maturity indicators...

Inside the XML Graphics project, we have the following available (if
anyone is interested to know):
* XMP metadata framework in XML Graphics Commons, read/write, work in
  progress
* PostScript DSC in XML Graphics Commons, read/write (no PS interpreter!)
* PNG and TIFF codecs in XML Graphics Commons, read/write
* PDF in FOP, write only
* RTF in FOP, write only
* SVG in Batik, read/write

Others:
* PDF (PDFBox @SourceForge), read/write, signalled interest for incubation

Personal wishlist:
* ODF, read/write
* Mars, read/write

On 10.07.2007 09:18:33 Carsten Ziegeler wrote:
> Afaik there is currently no central place at Apache where
> libraries/frameworks for handling of specific document formats are
> developed. We have single projects like poi of course.
> 
> If you are searching for java libraries which support a specific format,
> like some image formats, you'll find many libraries of varying quality
> and it's really hard (if not impossible) to choose a correct one.
> 
> I'm wondering if something could be done about it by starting a project
> at Apache which supports various file formats (like images, mp3 etc.) -
> perhaps by incubating some existing stuff.
> 
> Although Tika is more the framework for plugin in such stuff, it perhaps
> makes sense to try to start something like that as sub projects of Tika?
> 
> WDYT?
> 
> Carsten
> -- 
> Carsten Ziegeler
> [EMAIL PROTECTED]
> 


Jeremias Maerki
