Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.

The "Tika2_0RoadMap" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/Tika2_0RoadMap?action=diff&rev1=4&rev2=5

  = Background =
  This page is intended for a discussion of changes anticipated in Tika 2.0.
- This is only a first draft from one voice. Please contribute!
+ This is only a first draft initially from one voice. Please contribute!

  = Major Planned Changes =

@@ -18, +18 @@

  * Allow users to build composite parsers with configurable strategies via the config file ([[https://issues.apache.org/jira/browse/TIKA-1509|TIKA-1509]] and CompositeParserDiscussion). We will be working towards this gradually through Tika 1.8 and 1.9. By Tika 2.0, however, this will be the default.

+ * Allow for easily configurable parser sub-packages. The tika-app, tika-server and tika-bundle jars are now pushing or are > 50MB. It would be great if users easily could specify a subset of parsers they care about, either a la carte or by category (image, common office files (MSOffice, PDF, etc.), environmental data) and only get the dependencies required for that subset of parsers.
+
- * Move to Java 1.7 (???)
+ * Move to Java 1.8 (???)
+
+ * Solve the complex metadata challenge; see: [[https://issues.apache.org/jira/browse/TIKA-1607|TIKA-1607]] and [[https://issues.apache.org/jira/browse/TIKA-1691|TIKA-1691]] and [[http://mail-archives.apache.org/mod_mbox/incubator-tika-dev/201510.mbox/%[email protected]%3e|ISO 19115 discussion]] .... Or at least come to some accommodation that will allow for both easy key/values access and more advanced access for those who know what they're doing.

  = Minor Planned Changes =

  = Wishes =
- * Allow for easily configurable parser sub-packages. The tika-app, tika-server and tika-bundle jars are now pushing or are > 30MB. It would be great if users easily could specify a subset of parsers they care about, either a la carte or by category (image, common office files (MSOffice, PDF, etc.), environmental data) and only get the dependencies required for that subset of parsers.
+
