[ https://issues.apache.org/jira/browse/CONNECTORS-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033487#comment-16033487 ]
Karl Wright commented on CONNECTORS-1428:
-----------------------------------------
Hi [~julienFL], can you explain what this is trying to do?
{code}
public class TikaParser {
-  private static Parser parser = new AutoDetectParser();
+  private static Parser parser = null;
+  private static String currentConfig = null;
   private TikaParser() { }
+
+  public static synchronized void initParser(final String tikaConfig) {
+    if (!tikaConfig.equals(currentConfig)) {
+      InputStream is = new ByteArrayInputStream(tikaConfig.getBytes());
+      try {
+        TikaConfig conf = new TikaConfig(is);
+        parser = new AutoDetectParser(conf);
+        currentConfig = tikaConfig;
+      } catch (TikaException | IOException | SAXException e) {
+        parser = new AutoDetectParser();
+      }
+
+      Map<MediaType, Parser> parsers = ((AutoDetectParser) parser).getParsers();
+      parsers.put(MediaType.APPLICATION_XML, new HtmlParser());
+      ((AutoDetectParser) parser).setParsers(parsers);
+    }
+  }
{code}
It looks like the TikaParser class needs to be rearranged to allow
configurability. I'll take care of that if that's the intent.
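
For illustration only, here is a minimal sketch of one way such a rearrangement might look. It is not the committed fix. The class name ConfigKeyedCache and its nested Factory interface are hypothetical, and the sketch is kept free of Tika dependencies so it compiles on its own. In the connector, the factory would presumably wrap something like new AutoDetectParser(new TikaConfig(inputStream)), as the patch does, and the fallback would be new AutoDetectParser(). The sketch assumes two behaviours that differ from the patch: a null configuration is accepted and mapped to the fallback, and a configuration that fails to load is remembered so it is not re-parsed on every call.

```java
import java.util.Objects;
import java.util.function.Supplier;

/**
 * Hypothetical helper: holds one instance built from a configuration string
 * and rebuilds it only when the configuration string changes.
 */
final class ConfigKeyedCache<T> {

    /** Factory that may throw, e.g. while parsing configuration XML. */
    @FunctionalInterface
    interface Factory<V> {
        V build(String config) throws Exception;
    }

    private final Factory<T> factory;
    private final Supplier<T> fallback;
    private String currentConfig;   // last configuration we attempted to load
    private T current;              // instance built for currentConfig
    private boolean initialized;    // false until the first call to get()

    ConfigKeyedCache(Factory<T> factory, Supplier<T> fallback) {
        this.factory = Objects.requireNonNull(factory);
        this.fallback = Objects.requireNonNull(fallback);
    }

    /** Returns the cached instance, rebuilding it if the configuration changed. */
    synchronized T get(String config) {
        // Objects.equals is null-safe, unlike config.equals(currentConfig).
        if (!initialized || !Objects.equals(config, currentConfig)) {
            try {
                current = (config == null) ? fallback.get() : factory.build(config);
            } catch (Exception e) {
                // Assumption: a broken configuration falls back to defaults.
                current = fallback.get();
            }
            // Recorded even on failure, so a bad config is not retried each call.
            currentConfig = config;
            initialized = true;
        }
        return current;
    }
}
```

Callers would then ask for the parser with the configuration taken from the UI, for example cache.get(tikaConfigXml), instead of calling a separate static init method before each use.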
> Allow tika config parameter
> ---------------------------
>
> Key: CONNECTORS-1428
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1428
> Project: ManifoldCF
> Issue Type: Wish
> Components: Tika extractor
> Affects Versions: ManifoldCF 2.7
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Minor
> Fix For: ManifoldCF 2.8
>
> Attachments: CONNECTORS-1428.patch, CONNECTORS-1428v2.patch
>
>
> It would be nice to have an option to pass a Tika config file to the
> connector through the UI.
> The connector would load it in the "TikaParser" class, like this:
> private static Parser parser = new AutoDetectParser(new TikaConfig(new File("path/to/file")));
> This is just an example, of course; it has to be done properly.