Hi,

I'd like to propose a new Tika App for the 2.0 branch. One of the reasons we broke the Tika parsers into modules was the complexity of managing all the parser dependencies and their transitive dependencies. Developers can now use just the modules they want without pulling in the kitchen sink. Unfortunately, this approach doesn't simplify the problem in the tika-parser or tika-app projects, where the whole kitchen sink comes together again. This is a difficult problem, but I think it's one that the Apache Felix [1] project has done a good job of solving. I've described the approach and provided an implementation on my GitHub [2]; please see the README for details.

I'd like to get a sense from the community of whether this is a direction we'd like to go in, since it involves bringing in another stack. If we want to move in this direction, I'm happy to move it into the Tika 2.0 branch. I think this approach opens the door to some cool features like plugins, and will allow the modules to upgrade more aggressively because there is less pressure to match up the dependencies.

I've created a JIRA [3]. I'm happy to take feedback there or on this thread.


- Bob


[1] http://felix.apache.org/

[2] https://github.com/bobpaulin/tika-app-osgi

[3] https://issues.apache.org/jira/browse/TIKA-2076
