I forgot to mention: the largest PDF document is 29 MB, and I have set
max_doc_size to 50 MB. The core file is indeed from pdftotext.
Any ideas on how to ignore these files?

>From: Gilles Detillieux <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED] (garsila Ndzmande)
>CC: [EMAIL PROTECTED] (ht://Dig mailing list)
>Subject: Re: [htdig] rundig fails during PDF indexing
>Date: Fri, 25 Jan 2002 15:33:46 -0600 (CST)
>
>According to garsila Ndzmande:
> > How can I prevent rundig from crashing and dumping core, due to Bad or
> > protected PDF files?
> >
> > Error (0): PDF file is damaged - attempting to reconstruct xref table...
> > Error: Kid object (page 4) is wrong type (null)
> > Error: Page count in top-level pages object is incorrect
> > Error: Couldn't read page catalog
> > Error: Unknown Type 0 character set: Adobe-Identity
> > Error (0): PDF file is damaged - attempting to reconstruct xref table...
> > Error: Dictionary key must be a name object
> > Error: End of file inside dictionary
> > Error: End of file inside dictionary
> > Error: font resource is not a dictionary
> > Error: Weird page contents
> > Error (49721): Illegal character ')'
>
>Set your max_doc_size appropriately.  See http://www.htdig.org/FAQ.html#q5.2
>
>This problem shouldn't cause htdig to dump core, although perhaps pdftotext
>might.
>
>--
>Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
>Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
>Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
>Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
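
Since the core file comes from pdftotext rather than htdig itself, one
possible workaround is to call the converter through a small wrapper that
treats any failure as "no text" so the damaged file is simply skipped. The
following is only a sketch under assumptions, not part of ht://Dig: the
function name safe_pdftotext, the 60-second timeout, and the choice to
disable core dumps for the child process are all illustrative, and how the
wrapper is hooked into the indexing run depends on the ht://Dig version and
is not shown.

```python
import resource
import subprocess
import sys


def _no_core_dumps():
    # Runs in the child just before exec: forbid core files (Unix only).
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))


def safe_pdftotext(pdf_path, command=("pdftotext",), timeout=60):
    """Return the extracted text, or "" if the converter fails.

    `command` is the converter invocation (hypothetical default: a
    pdftotext binary on PATH). The PDF path and "-" (write to stdout)
    are appended to it.
    """
    try:
        result = subprocess.run(
            [*command, pdf_path, "-"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,   # hide the "Error: ..." chatter
            timeout=timeout,             # give up on a hung converter
            preexec_fn=_no_core_dumps,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"skipping {pdf_path}: {exc}", file=sys.stderr)
        return ""
    if result.returncode != 0:
        # Non-zero exit, or a negative value if killed by a signal
        # such as SIGSEGV: treat the document as empty.
        print(f"skipping {pdf_path}: converter exited with "
              f"{result.returncode}", file=sys.stderr)
        return ""
    # latin-1 maps every byte, so decoding itself cannot fail.
    return result.stdout.decode("latin-1")
```

With a wrapper like this, a damaged or protected PDF yields an empty
document and a line on stderr instead of a crash, so the rest of the
indexing run can continue.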






_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
