I just tried adding a META-INF/services/org.apache.tika.parser.Parser file to each bundle in the straw man implementation, and it seemed to do the trick. It looks like the ServiceLoader code searches the classloader for every copy of that file, so each jar's META-INF/services/org.apache.tika.parser.Parser entries get picked up and added to the list. I've updated the code on GitHub to include one file per bundle. This might be the way to go.

Example:
https://github.com/bobpaulin/tika/tree/trunk/tika-parser-bundles/tika-image-parser-bundle/src/main/resources/META-INF/services
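
To illustrate the behaviour described above, here is a minimal sketch, not Tika code. It assumes two temporary directories standing in for bundle jars, each holding its own META-INF/services file, and checks that a single ServiceLoader.load call picks up the providers from both. DemoParser, ImageParser, PdfParser, fakeBundle and discover are hypothetical names made up for the example; only ServiceLoader and URLClassLoader are real JDK classes.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class ServiceFilesDemo {

    /** Hypothetical stand-in for org.apache.tika.parser.Parser. */
    public interface DemoParser {
        String name();
    }

    /** Hypothetical provider that would live in an "image" bundle. */
    public static class ImageParser implements DemoParser {
        public String name() { return "image"; }
    }

    /** Hypothetical provider that would live in a "pdf" bundle. */
    public static class PdfParser implements DemoParser {
        public String name() { return "pdf"; }
    }

    /**
     * Creates a temp directory that mimics a bundle root (assumption: a
     * directory on the class path behaves like a jar for this purpose).
     * It contains one META-INF/services file naming a single provider.
     */
    static URL fakeBundle(Class<? extends DemoParser> provider) throws IOException {
        Path root = Files.createTempDirectory("bundle");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        // The service file is named after the service interface's binary name.
        Files.write(services.resolve(DemoParser.class.getName()),
                (provider.getName() + "\n").getBytes(StandardCharsets.UTF_8));
        return root.toUri().toURL();
    }

    /** Loads providers through a class loader that sees both fake bundles. */
    static List<String> discover() throws IOException {
        URL[] bundles = { fakeBundle(ImageParser.class), fakeBundle(PdfParser.class) };
        try (URLClassLoader loader =
                new URLClassLoader(bundles, ServiceFilesDemo.class.getClassLoader())) {
            List<String> names = new ArrayList<>();
            // ServiceLoader reads every matching service file the loader can find.
            for (DemoParser parser : ServiceLoader.load(DemoParser.class, loader)) {
                names.add(parser.name());
            }
            return names;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(discover());
    }
}
```

Each fake bundle only lists its own parser, yet the loader returns both, which is the property the per-bundle service files rely on.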


- Bob

On 8/3/2015 9:21 PM, Allison, Timothy B. wrote:
+1 to moving the source to bundles.  I think for a 2.0 it would be easier
to consolidate into a parser uber jar than trying to tease things out
like I did in the straw man impl. However deciding how to break things
up might take some experimentation.

Y, and the strawman is a great easy entry down this path towards 2.0.  I think 
the main hangup will be coming to consensus about granularity and nature of the 
packages, but we can burn that bridge when we get to it.  There are some 
dependencies between parsers, but we can work through that.

1) To spin up the GUI you need org.apache.tika.parser.util (perhaps
consider moving this up to core).
Y, I put that in tika-parsers because it relies on commons codec, and I wanted 
to keep that dependency out of tika-core.  But, I'm willing to add it to 
tika-core if there aren't objections.


2) Since the META-INF/services/org.apache.tika.parser.Parser is in
tika-parser we'd need to rethink the static ServiceLoader strategy to
either always be dynamic or figure out a way to have each jar bring
their own static loader.

Hmmm...is there a way to specify this in one overall tika-config file or in 
separate configs in each bundle (yuck)...

