Hi Nick,
thanks for your reply.
On 16.02.12 16:51, Nick Burch wrote:
On Tue, 14 Feb 2012, Stephan Mühlstrasser wrote:
https://issues.apache.org/jira/browse/TIKA-527
...
The problem is that using the proposed method does not work for me.
Any use of the configuration file apparently sends Tika into an
endless recursion, even without overriding a built-in parser in the
configuration file.
Are you able to produce a unit test that shows the problem?
That's what I was trying to provide with the example in my previous message:
If I understand it correctly, the following configuration file should
have the same effect as the built-in configuration:
$ cat tika-config.xml
<properties>
<parsers>
<parser class="org.apache.tika.parser.DefaultParser"/>
</parsers>
</properties>
If you invoke the Tika CLI application with this configuration file, the
error happens. Just start it like this: "java
-Dtika.config=tika-config.xml -jar tika-app-1.0.jar --list-parsers".
Ah, I'm not sure that's correct. I think you also need to specify
mimetypes and a detector. Looking at lines 145 to 172 of TikaConfig, it
seems that you either get the defaults with no config, or specify them
all with your own config.
Ok, I see now in the source what you mean. Then the example in TIKA-527
is not complete, as it does not have mimetypes and a detector.
Since yesterday I have got my override working by packaging a
META-INF/services/org.apache.tika.parser.Parser file into the JAR file
together with my parser, so I no longer need the configuration file
approach. I still think it should be considered a bug if an incorrect or
insufficient configuration file sends Tika into an endless recursion
instead of producing a meaningful error message.
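A minimal sketch of that service-file approach, in case it helps others.
The parser class name com.example.MyParser and the staging directory
"build" are placeholders (assumptions, not the real names); the file
format is the standard java.util.ServiceLoader one, with one
implementation class name per line:

```shell
# Hypothetical names: com.example.MyParser and the "build" staging
# directory are placeholders for the real parser class and build layout.
mkdir -p build/META-INF/services

# The provider file is named after the service interface and lists one
# fully qualified implementation class per line.
printf '%s\n' 'com.example.MyParser' \
  > build/META-INF/services/org.apache.tika.parser.Parser

# Show what will end up in the JAR next to the compiled parser class.
cat build/META-INF/services/org.apache.tika.parser.Parser
```

The staging directory would then be packaged together with the compiled
parser class, for example with "jar cf my-parser.jar -C build ." (the JAR
name is again only a placeholder), and that JAR put on the classpath
alongside Tika.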
Thanks
Stephan
--
_______________________________________________________________
Stephan Mühlstrasser [email protected] www.pdflib.com
PDFlib GmbH, Franziska-Bilek-Weg 9, 80339 München, Germany
Court of registry/Amtsgericht München HRB 129497
Managing Directors/Geschäftsführer: Thomas Merz, Petra Porst
---------------------------------------------------------------
PDFlib: powerful toolkits for PDF developers since 1997
_______ See www.pdflib.com/products for product details________