David, is it failing on one particular file, or always, regardless of
the input?
POI hints that there is an illegal offset, which is probably the cause of
the error.

--Oleg



On Wed, Dec 12, 2012 at 4:31 PM, David Morana (JIRA) <[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/TIKA-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529984#comment-13529984]
>
> David Morana commented on TIKA-1041:
> ------------------------------------
>
> After some research, I upgraded the POI jars to 3.9 (I was at v3.8 beta),
> but no luck: I'm still getting the error above.
>
> > Tika 1.2 universalcharset errors
> > --------------------------------
> >
> >                 Key: TIKA-1041
> >                 URL: https://issues.apache.org/jira/browse/TIKA-1041
> >             Project: Tika
> >          Issue Type: Bug
> >    Affects Versions: 1.2
> >         Environment: I'm running solr 4.0 with tika 1.2 on tomcat 7.0.8 with manifoldcf v1.1dev
> >            Reporter: David Morana
> >             Fix For: 1.2, 1.3
> >
> >
> > This is somewhat confusing and frustrating. I successfully crawled
> > Opentext using all of the above. Then I recrawled and it aborted almost
> > immediately.
> > It choked on images, so I excluded them for now.
> > But now it's choking on txt files!
> > Sometimes I get this error:
> > SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
> > and sometimes I get this one:
> > SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/apache/tika/parser/txt/UniversalEncodingListener
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
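
Both errors quoted above are NoClassDefFoundError, so one way to narrow
things down is to check whether the two named classes can be loaded at all.
The following is only a minimal sketch: ClasspathCheck and isLoadable are
hypothetical names, not part of Tika or Solr, and the check is meaningful
only when it runs with the same classpath as the Solr webapp. It is also an
assumption that the second error is a follow-on from the first, and that the
first class normally comes from the juniversalchardet jar.

```java
public class ClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located
    // and linked by the given class loader, without initializing it.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while linking.
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathCheck.class.getClassLoader();

        // The two class names are taken from the error messages in the report.
        String[] names = {
            "org.mozilla.universalchardet.CharsetListener",
            "org.apache.tika.parser.txt.UniversalEncodingListener"
        };

        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name, loader));
        }
    }
}
```

If the first class is reported as not loadable, the jar that provides it is
probably missing from the webapp's lib directory, which would be worth ruling
out before looking further at POI.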
