As a side note, we only see the classloading problems when running on
Java 5. Java 6 works just fine, so it seems the Java XML library has
changed its implementation-loading mechanism between the two versions.
Also, I forgot to include the link to HTMLParser, so here it is.

[1] htmlparser.sourceforge.net
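
For anyone who wants to compare the Java 5 and Java 6 behaviour
themselves, here is a minimal sketch (not part of our patch) that prints
which SAXParserFactory implementation JAXP resolves and which class
loader defined it. The class name JaxpProbe is made up for illustration;
the calls are plain JAXP and java.lang. The output depends on the JVM and
on what is visible to the calling bundle, so I am not listing an expected
result.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper, for illustration only.
class JaxpProbe {

    /** Returns the resolved factory class name and the loader that defined it. */
    static String describe() {
        // newInstance() runs the JAXP lookup (system property, service
        // provider files, platform default) and returns whatever it finds.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        Class<?> impl = factory.getClass();
        // getClassLoader() returns null for classes defined by the bootstrap loader.
        ClassLoader loader = impl.getClassLoader();
        return impl.getName() + " loaded by "
                + (loader == null ? "bootstrap" : loader.toString());
    }

    public static void main(String[] args) {
        System.out.println("java.version   = " + System.getProperty("java.version"));
        System.out.println("context loader = "
                + Thread.currentThread().getContextClassLoader());
        System.out.println("SAX factory    = " + describe());
    }
}
```

Running this once from a plain main() and once from inside a bundle
should show whether the factory comes from the JRE or from a Xerces jar
on the bundle class path.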

> -----Original Message-----
> From: Daan de Wit [mailto:d.de....@wis.nl]
> Sent: Monday 23 March 2009 10:42
> To: tika-dev@lucene.apache.org
> Subject: classloading problems with Xerces
> 
> Hi,
> 
> 
> 
> We tried to integrate Tika in our product instead of using our own
> parsing library, all goes well except for one problem. We use an OSGi
> environment, and the Xerces library used by NekoHTML is causing us real
> problems with classloading. So we decided to ditch NekoHTML, and use
> HTMLParser [1] instead. HTMLParser's SAX implementation has some bugs
> though, so we sub-classed it in Tika's HtmlParser class. If there is any
> interest, I can create a JIRA-issue and attach the patch there.
> 
> Another minor problem we encountered is that the tests can not be run
> without first copying the contents of src/main/resources to
> src/main/resources/org/apache/tika.
> 
> 
> 
> Daan
