Hi,
These are all bugs in Apache TIKA, not Solr. Some of them are already fixed in later TIKA versions, so you may want to try the soon-to-be-released Solr 3.1, which bundles a newer TIKA.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

From: Deepak Singh [mailto:[email protected]]
Sent: Wednesday, March 09, 2011 12:03 PM
To: [email protected]
Subject: Re: Solr Exception

HTTP ERROR:500 (INTERNAL SERVER ERROR)

For DOC files:
org.apache.tika.exception.TikaException:
- Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@1248f2
  Caused by: org.apache.poi.hpsf.IllegalPropertySetDataException: The property set claims to have a size of 16 bytes. However, it exceeds 16 bytes.
- TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@1248f2
  Caused by: java.io.IOException: block[ 0 ] already removed - does your POIFS have circular or duplicate block references?

For PDF files:
org.apache.tika.exception.TikaException:
- Unexpected RuntimeException from org.apache.tika.parser.Pdfparser@1b4cd65
  Caused by: java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot be cast to org.pdfbox.cos.COSDictionar
  Caused by: java.lang.NullPointerException
- Unable to extract PDF content

HTTP ERROR:400 (BAD REQUEST)
- This error occurs when some fields are missing
  ERROR: unknown field 'language' (Ex: content_status, description, version)

On Wed, Mar 9, 2011 at 4:19 PM, Gora Mohanty <[email protected]> wrote:

Hi,

This is probably better directed to the user list. Also, please provide details of the exceptions from your log files.

Regards,
Gora
