> Your view is correct. The idea is to avoid direct parser class references in
> jackrabbit-core and just rely on the service provider loader mechanism in
> Tika to pick up all the available parsers.
>
> We also decided to move the tika-parsers dependency from jackrabbit-core to
> deployment packages like jackrabbit-webapp and jackrabbit-standalone. This
> should make it even easier for people to set up custom deployments with few
> or no parser libraries.

That's brilliant, thanks for clarifying ...

Regards,
Kevin

--
Kevin Jansz
[email protected]
Level 7, 10-16 Queen Street, Melbourne 3000 Australia
Tel +61 3 9621 2773 | Fax +61 3 9621 2776
Exari Systems
Boston | London | Melbourne | Munich
www.exari.com
Test drive our software online - www.exari.com/demo-trial.html
Read our blog on document assembly - blog.exari.com

On 10 March 2011 20:27, Jukka Zitting <[email protected]> wrote:
> Hi,
>
> On 03/09/2011 04:51 AM, Kevin Jansz wrote:
>>
>> It's not a huge issue I guess as it seems with tika 0.9 (or 0.8.1?)
>> the PDF parser issue will be resolved in which case I expect the
>> code in org.apache.jackrabbit.core.query.pdf.* will disappear along
>> with reference to it from the tika-config.xml.
>
> Yes, that's what we've already done in trunk.
>
>> I'm taking the time to mention it here in case it saves someone time
>> and also to gauge if our view of lucene, tika and the parsers is
>> incorrect - that future releases of jackrabbit may still include
>> parsers other than DefaultParser and EmptyParser in its
>> tika-config.xml.
>
> Your view is correct. The idea is to avoid direct parser class references in
> jackrabbit-core and just rely on the service provider loader mechanism in
> Tika to pick up all the available parsers.
>
> We also decided to move the tika-parsers dependency from jackrabbit-core to
> deployment packages like jackrabbit-webapp and jackrabbit-standalone. This
> should make it even easier for people to set up custom deployments with few
> or no parser libraries.
>
> --
> Jukka Zitting
>
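For anyone reading this thread later, the service provider idea mentioned above can be sketched with the JDK's own java.util.ServiceLoader. This is only an illustration under assumptions: DocumentParser and ParserDiscoverySketch are hypothetical names, not Tika or Jackrabbit classes, and Tika's actual loader may differ in detail. The point is that the calling code names no concrete parser class; whatever providers are registered under META-INF/services on the classpath get picked up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ParserDiscoverySketch {

    /** Hypothetical stand-in for a parser interface; not Tika's actual API. */
    public interface DocumentParser {
        String supportedType();
    }

    /**
     * Collects every provider of the given service type that is registered
     * on the classpath through a META-INF/services provider file.
     */
    public static <T> List<T> discover(Class<T> service) {
        List<T> found = new ArrayList<>();
        for (T provider : ServiceLoader.load(service)) {
            found.add(provider);
        }
        return found;
    }

    public static void main(String[] args) {
        // No concrete parser class is referenced here; the result depends
        // only on which provider jars are deployed alongside this code.
        List<DocumentParser> parsers = discover(DocumentParser.class);
        System.out.println("parsers found: " + parsers.size());
    }
}
```

With no provider file for the interface on the classpath, the list is simply empty, which corresponds to the "few or no parser libraries" deployment described above. Adding a jar that registers an implementation would change the result without touching this code.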
