Hi all,
Is there any way to set up a custom extractor in Jackrabbit 2?
As far as I can tell, JR2 uses Tika for all text extraction, but it does
not give a way to specify textFilterClasses as previous versions did.
When looking into JackrabbitParser.java (in the JR implementation), I found
this hard-coded call:
new AutoDetectParser(new
TikaConfig(JackrabbitParser.class.getResourceAsStream("tika-config.xml")))
which rules out any possibility of plugging in custom extractors.
Furthermore, textFilterClasses values (in workspace.xml) are handled only
for backward compatibility, and solely for "Apache-implemented classes";
the code does a simple
logger.warn("Ignoring unknown text extractor class: {}", name);
for all the rest.
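
To make the behaviour I am describing concrete, here is a minimal sketch of what the backward-compatibility handling appears to do. It is not the actual Jackrabbit source: the class TextFilterCompatSketch, the method ignoredClasses and the short list of recognised names are my own simplifications, and java.util.logging stands in for the real logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

public class TextFilterCompatSketch {
    private static final Logger logger =
            Logger.getLogger(TextFilterCompatSketch.class.getName());

    // Illustrative subset of the old Apache extractor class names.
    // Assumption: the real list in Jackrabbit may be longer or different.
    private static final Set<String> KNOWN = Set.of(
            "org.apache.jackrabbit.extractor.PlainTextExtractor",
            "org.apache.jackrabbit.extractor.PdfTextExtractor");

    // Splits a comma-separated textFilterClasses value and returns the
    // names that are not recognised, logging a warning for each one.
    static List<String> ignoredClasses(String textFilterClasses) {
        List<String> ignored = new ArrayList<>();
        for (String raw : textFilterClasses.split(",")) {
            String name = raw.trim();
            if (!name.isEmpty() && !KNOWN.contains(name)) {
                logger.warning("Ignoring unknown text extractor class: " + name);
                ignored.add(name);
            }
        }
        return ignored;
    }
}
```

With this sketch, a value such as "org.apache.jackrabbit.extractor.PlainTextExtractor, com.example.MyExtractor" (the second name is a made-up custom class) would leave only com.example.MyExtractor in the ignored list, which is exactly the situation I would like to avoid.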
Thanks for the help.
Taha Ben Salah