Oh, thanks for the better solution.

On Wed, Mar 9, 2011 at 4:36 PM, Uwe Schindler <[email protected]> wrote:
> Hi,
>
> These are all bugs in Apache TIKA not Solr, some of them are already fixed
> in later TIKA versions (so you may try the soon-to-be-released Solr 3.1
> version which contains a newer TIKA bundled).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
>
> *From:* Deepak Singh [mailto:[email protected]]
> *Sent:* Wednesday, March 09, 2011 12:03 PM
> *To:* [email protected]
> *Subject:* Re: Solr Exception
>
> *HTTP ERROR :500 (INTERNAL SERVER ERROR)*
>
> *For DOC files:*
> org.apache.tika.exception.TikaException :
> -Unexpected RuntimeException from
> org.apache.tika.parser.microsoft.OfficeParser@1248f2
> Caused by: org.apache.poi.hpsf.IllegalPropertySetDataException: The
> property set claims to have a size of 16 bytes. However, it exceeds 16
> bytes.
>
> -TIKA-198: Illegal IOException from
> org.apache.tika.parser.microsoft.OfficeParser@1248f2
> Caused by: java.io.IOException: block[ 0 ] already removed - does your
> POIFS have circular or duplicate block references?
>
> *For PDF files:*
> org.apache.tika.exception.TikaException :
> -Unexpected RuntimeException from org.apache.tika.parser.Pdfparser@1b4cd65
> Caused by: java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot be
> cast to org.pdfbox.cos.COSDictionar
> Caused by: java.lang.NullPointerException
>
> -Unable to extract PDF content
>
> *HTTP ERROR:400 (BAD REQUEST)*
> -This error come when some fields are missing
> ERROR:unknown field 'language' (Ex:content_status, description,version)
>
> On Wed, Mar 9, 2011 at 4:19 PM, Gora Mohanty <[email protected]> wrote:
>
> > Hi,
> >
> > This is probably better directed to the user list. Also, please provide
> > details of the exceptions from your log files.
> >
> > Regards,
> > Gora
