Hello,
I’m using Apache Tika to parse different document formats in my application. I
created a custom CSV parser that has to be configured at runtime. I tried to
register it by getting the parser map from the DefaultParser, adding my CSV
parser to that map, and setting the map back on the DefaultParser:
```
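// getParser() is not declared to return DefaultParser, so a cast is needed.
DefaultParser parser = (DefaultParser) TikaConfig.getDefaultConfig().getParser();
// Type-to-parser map for the given ParseContext.
Map<MediaType, Parser> parsers = parser.getParsers(context);
CsvParserWrapper csvParser = new CsvParserWrapper(settings, csvHeaders, csvHh);
parsers.put(MediaType.text("csv"), csvParser);
// After this call the parser order and count are different.
parser.setParsers(parsers);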
```
Unfortunately, setting the parsers this way changes both the order and the
number of parsers in the DefaultParser (I think it creates duplicates). After
this I started to get errors because, for example, the content type of a parsed
HTML file (http://lucene.apache.org/solr/solrnews.html) is detected as
"application/xml" and the file is parsed by DcXMLParser.
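To illustrate what I suspect is happening, here is a minimal sketch in plain
Java. It does not use Tika at all: `ParserOrderDemo`, `FakeParser`,
`sampleParsers`, `toTypeMap` and `fromTypeMap` are hypothetical names, and the
sketch assumes that the ordered parser list is flattened into a map keyed by
media type and then rebuilt with one entry per key. I have not confirmed that
Tika behaves exactly like this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model only: FakeParser and the methods below are hypothetical
// stand-ins and do not call or reproduce Tika's code.
class ParserOrderDemo {

    /** A named parser together with the media types it claims to handle. */
    static final class FakeParser {
        final String name;
        final List<String> types;

        FakeParser(String name, List<String> types) {
            this.name = name;
            this.types = types;
        }
    }

    /** An ordered parser list in which some parsers handle several types. */
    static List<FakeParser> sampleParsers() {
        return Arrays.asList(
            new FakeParser("HtmlParser", Arrays.asList("text/html", "application/xhtml+xml")),
            new FakeParser("XmlParser", Arrays.asList("application/xml", "image/svg+xml")),
            new FakeParser("TxtParser", Collections.singletonList("text/plain")));
    }

    /** Flattens the list into a type-to-parser map; a HashMap keeps no insertion order. */
    static Map<String, FakeParser> toTypeMap(List<FakeParser> parsers) {
        Map<String, FakeParser> byType = new HashMap<>();
        for (FakeParser parser : parsers) {
            for (String type : parser.types) {
                byType.put(type, parser);
            }
        }
        return byType;
    }

    /** Rebuilds a parser list from the map, producing one single-type entry per key. */
    static List<FakeParser> fromTypeMap(Map<String, FakeParser> byType) {
        List<FakeParser> rebuilt = new ArrayList<>();
        for (Map.Entry<String, FakeParser> entry : byType.entrySet()) {
            rebuilt.add(new FakeParser(entry.getValue().name,
                    Collections.singletonList(entry.getKey())));
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        List<FakeParser> original = sampleParsers();
        List<FakeParser> rebuilt = fromTypeMap(toTypeMap(original));
        System.out.println("entries before: " + original.size());
        System.out.println("entries after:  " + rebuilt.size());
        for (FakeParser parser : rebuilt) {
            System.out.println(parser.name + " -> " + parser.types);
        }
    }
}
```

In this model the three original entries come back as five single-type
entries, in whatever order the map happens to iterate. That would match the
change in count and ordering that I see.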
So my questions are:
1) How can I add a custom parser to the DefaultParser at runtime? I can’t
initialize it at startup, because it has to be configurable.
2) How can I get the list of parsers from the DefaultParser in the same order
in which the DefaultParser holds them?
Kind regards,
Karol Abramczyk