So I just tried adding a META-INF/services/org.apache.tika.parser.Parser
file to each bundle in the straw man implementation, and it seemed to do
the trick. It looks like the ServiceLoader code asks the classloader for
every copy of that file it can see and iterates through them, so each
jar's META-INF/services/org.apache.tika.parser.Parser entries get added
to the list of parsers. I've updated the code on GitHub to include one
file per bundle. This might be the way to go.
Example:
https://github.com/bobpaulin/tika/tree/trunk/tika-parser-bundles/tika-image-parser-bundle/src/main/resources/META-INF/services
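
To illustrate the per-bundle lookup described above, here is a minimal sketch that does not depend on Tika. Everything in it is an assumption made for the demo: the nested Parser interface is a hypothetical stand-in for org.apache.tika.parser.Parser, ImageParser and PdfParser are made-up providers, and two temp directories play the role of two bundle jars. Each "bundle" carries its own META-INF/services file, and java.util.ServiceLoader picks up both.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

class PerBundleServiceFiles {

    /** Hypothetical stand-in for org.apache.tika.parser.Parser. */
    public interface Parser {
        String name();
    }

    /** Hypothetical provider that would live in an "image" bundle. */
    public static class ImageParser implements Parser {
        public String name() { return "image"; }
    }

    /** Hypothetical provider that would live in a "pdf" bundle. */
    public static class PdfParser implements Parser {
        public String name() { return "pdf"; }
    }

    /** Creates a temp directory acting as one bundle jar with its own services file. */
    static URL fakeBundle(Class<?> provider) throws IOException {
        Path root = Files.createTempDirectory("bundle");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        // The file is named after the service interface and lists the provider class.
        Files.write(services.resolve(Parser.class.getName()),
                provider.getName().getBytes(StandardCharsets.UTF_8));
        return root.toUri().toURL();
    }

    /** Loads every Parser named by any services file visible to the classloader. */
    static List<String> discover() {
        try {
            URL[] bundles = { fakeBundle(ImageParser.class), fakeBundle(PdfParser.class) };
            ClassLoader parent = PerBundleServiceFiles.class.getClassLoader();
            try (URLClassLoader loader = new URLClassLoader(bundles, parent)) {
                List<String> names = new ArrayList<>();
                // ServiceLoader reads each copy of the services file in turn.
                for (Parser p : ServiceLoader.load(Parser.class, loader)) {
                    names.add(p.name());
                }
                Collections.sort(names);
                return names;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discover());
    }
}
```

Running main should list both parsers even though neither services file mentions the other, which matches what I saw with one file per bundle. The temp directories are left behind for simplicity.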
- Bob
On 8/3/2015 9:21 PM, Allison, Timothy B. wrote:
+1 to moving the source to bundles. I think for 2.0 it would be easier
to consolidate into a parser uber jar than to try to tease things out
like I did in the straw man impl. However, deciding how to break things
up might take some experimentation.
Y, and the strawman is a great, easy entry point on this path toward 2.0. I think
the main hangup will be coming to consensus about the granularity and nature of the
packages, but we can burn that bridge when we get to it. There are some
dependencies between parsers, but we can work through those.
1) To spin up the GUI you need org.apache.tika.parser.util (perhaps
consider moving this up to core).
Y, I put that in tika-parsers because it relies on commons codec, and I wanted
to keep that dependency out of tika-core. But, I'm willing to add it to
tika-core if there aren't objections.
2) Since the META-INF/services/org.apache.tika.parser.Parser is in
tika-parser, we'd need to rethink the static ServiceLoader strategy to
either always be dynamic or figure out a way to have each jar bring
its own static loader.
Hmmm...is there a way to specify this in one overall tika-config file or in
separate configs in each bundle (yuck)...