Thanks Bob, took care of 6 for ya: https://wiki.apache.org/tika/ContributorsGroup
I should be able to review this, but it's not going to be a complete review for a few weeks.. thanks for your great work

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Bob Paulin <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, January 5, 2016 at 7:54 PM
To: "[email protected]" <[email protected]>
Subject: Tika 2.0 Modules first pass.

>All,
>
>I took a stab at the initial module structure based on Tim's and my email
>[1]. If a package didn't seem to fit with anything else I created an
>individual project for it. If any of the groupings don't make sense or
>folks think there are better ways to organize I'm happy to move stuff
>around. Patches are welcome :). I have a JIRA created [2]. Committed
>with rev 1723223.
>
>There's still a good amount of outstanding work:
>1) All this could use more testing. Especially with the external parsers.
>2) As Tim has already raised there is the issue of dual maintaining
>branches. There are likely some fixes in trunk that are not currently
>applied to the 2.0 branch.
>3) The tika-parser project is currently using the maven shade plugin and
>that is causing issues creating the OSGi Manifest.MF file. I should be
>able to find a way around this.
>4) Still need to recreate the OSGi uber jar with all dependencies
>packaged with the tika code.
>5) There are still some classes in the tika-parser project. Should
>these all be moved to core? A common project?...
>6) Documentation. I could use some Wiki access. Username: BobPaulin.
>7) There are some dependencies in the tika-parser project that were not
>needed to compile any of the individual modules or run tests. Are they
>still needed?
>8) Where does the
>org.apache.tika.parser.external.CompositeExternalParser ServiceLoader
>(META-INF/services/org.apache.tika.parser.Parser) config belong? I
>moved it to tika-core since that is where the class lives.
>9) Subcomponent licenses. I moved them to the modules they belong in
>but I need to figure out a way to make them bubble up to the uber jars.
>Or perhaps they need to be dual maintained.
>10) Anything I may be forgetting....;)
>
>For the most part the changes just organize the existing
>packages. There are a handful of changes to the test suite in order to
>break some cyclical dependencies. Here's an overview of how the
>projects interrelate at the moment:
>
>tika-parser-modules
> - /tika-advanced-module
> - /tika-cad-module
>     -> tika-text-module [test]
> - /tika-code-module
>     -> tika-text-module [test]
> - /tika-database-module
>     -> tika-office-module [test]
> - /tika-ebook-module
>     -> tika-text-module
> - /tika-journal-module
>     -> tika-pdf-module
> - /tika-multimedia-module
>     -> tika-web-module [test]
>     -> tika-office-module [test]
>     -> tika-pdf-module [test]
> - /tika-office-module
>     -> tika-web-module [test]
>     -> tika-package-module [test]
>     -> tika-text-module [test]
> - /tika-package-module
> - /tika-pdf-module
>     -> tika-text-module [test]
>     -> tika-package-module [test]
>     -> tika-office-module [test]
> - /tika-scientific-module
>     -> tika-text-module [test]
> - /tika-text-module
> - /tika-web-module
>     -> tika-text-module [test]
>     -> tika-package-module [test]
>
>Very interested in feedback since we have been talking about this for a
>bit but I'm sure actually seeing it will create more discussion. Looking
>at how much simpler the individual pom files are does seem to demonstrate
>that this will be a good thing for the project.
>
>Cheers,
>
>- Bob
>
>[1]
>http://mail-archives.apache.org/mod_mbox/tika-dev/201508.mbox/%3C55CF4C19.6050503%40bobpaulin.com%3E
>[2] https://issues.apache.org/jira/browse/TIKA-1824
