Hello,

I’m using Apache Tika to parse different document formats in my application. I
created a custom CSV parser that has to be configured at runtime. I tried to
add it by getting the parsers map from the DefaultParser, adding my CSV parser
to this map, and setting the map back on the DefaultParser.

```
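// Cast assumes the default config returns a DefaultParser (getParser() is typed as Parser)
DefaultParser parser = (DefaultParser) TikaConfig.getDefaultConfig().getParser();
Map<MediaType, Parser> parsers = parser.getParsers(context);
// CsvParserWrapper is my custom parser, configured at runtime
CsvParserWrapper csvParser = new CsvParserWrapper(settings, csvHeaders, csvHh);
parsers.put(MediaType.text("csv"), csvParser);
parser.setParsers(parsers);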
```
Unfortunately, setting parsers this way changes both the ordering and the
number of parsers in the DefaultParser (duplicates, I think). After this I
started to receive errors because, for example, the content type of a parsed
HTML file (http://lucene.apache.org/solr/solrnews.html) is recognized as
"application/xml" and the file is parsed by DcXMLParser.
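
To make my suspicion concrete, here is a minimal plain-Java sketch with no Tika classes in it. `FakeParser`, `toMap`, and `fromMap` are hypothetical stand-ins, and I am only assuming that the map round trip behaves roughly like this: the ordered parser list is flattened into a map keyed by media type, and a new list is then rebuilt from that map with one entry per type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ParserRoundTrip {
    // Hypothetical stand-in for a parser: a name plus the media types it handles.
    static final class FakeParser {
        final String name;
        final List<String> types;

        FakeParser(String name, List<String> types) {
            this.name = name;
            this.types = types;
        }
    }

    // Flatten the ordered parser list into a media type -> parser map.
    // A HashMap does not remember the order in which the parsers were listed.
    static Map<String, FakeParser> toMap(List<FakeParser> parsers) {
        Map<String, FakeParser> map = new HashMap<>();
        for (FakeParser p : parsers) {
            for (String type : p.types) {
                map.put(type, p);
            }
        }
        return map;
    }

    // Rebuild a list from the map: one single-type entry per media type,
    // in whatever order the map happens to iterate.
    static List<FakeParser> fromMap(Map<String, FakeParser> map) {
        List<FakeParser> list = new ArrayList<>();
        for (Map.Entry<String, FakeParser> e : map.entrySet()) {
            list.add(new FakeParser(e.getValue().name,
                    Collections.singletonList(e.getKey())));
        }
        return list;
    }

    public static void main(String[] args) {
        List<FakeParser> original = Arrays.asList(
                new FakeParser("html", Arrays.asList("text/html", "application/xhtml+xml")),
                new FakeParser("xml", Arrays.asList("application/xml", "image/svg+xml")),
                new FakeParser("txt", Arrays.asList("text/plain")));

        List<FakeParser> rebuilt = fromMap(toMap(original));
        System.out.println(original.size() + " parsers before, " + rebuilt.size() + " after");
        for (FakeParser p : rebuilt) {
            System.out.println(p.name + " handles " + p.types);
        }
    }
}
```

In this sketch three parsers go in and five come out, one per media type, and the order of the rebuilt list depends on the map's iteration order rather than on the original list. That matches what I observe, but I have not confirmed that this is what the DefaultParser does internally.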

So my questions are:
1) How can I add a custom parser to the DefaultParser at runtime? I can’t
initialize it at startup, because it has to be configurable.
2) How can I get the list of parsers from the DefaultParser in the same order
as they are held in the DefaultParser?

Kind regards,
Karol Abramczyk
