On 8/14/2002 at 11:45 AM Gilles Detillieux <[EMAIL PROTECTED]> wrote:

[snip]
>> !!   Error: Copying of text from this document is not allowed.
>> !!   Error (0): PDF file is damaged - attempting to reconstruct xref table...
>> !!   Error: Top-level pages object is wrong type (null)
>> !!   Error: Couldn't read page catalog

[snip]
>If you run "htdig -v", you'll see what URL it's working on at the time
>the error occurs.  You may not even have to do that, though.  Just search
>for the biggest PDF you have, and set max_doc_size to something larger
>than its size.  The error above is because the PDF was truncated.


Thank you for the excellent suggestions.  As it turned out, the problem PDF file was 
corrupted rather than oversized.

It was a little tricky locating the file because the error message goes to stderr,
while the "-v" output goes to stdout, so the error message doesn't necessarily appear
at the correct place in the output.  To match the error message to the correct
filename, I sent both streams to the same log file by redirecting stderr to stdout:

htdig -i -s -v > log.txt 2>&1

I was then able to look at log.txt and see exactly which file was causing the trouble.
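
For anyone who wants to skip scrolling through a long log, here is a rough sketch of how the combined log could be searched. The function name show_error_context is made up for illustration, and it assumes a grep that supports the -B (lines of leading context) option, as GNU and BSD grep do. It makes no assumption about the exact format of the "-v" lines; it simply prints whatever precedes each line containing "Error".

```shell
# Hypothetical helper (not part of ht://Dig): print every line containing
# "Error" in a combined stdout+stderr log, together with the few lines that
# come just before it, so the URL being processed at the time is visible.
show_error_context() {
    # $1 = log file, $2 = number of preceding lines to show (default 3)
    grep -n -B "${2:-3}" 'Error' "$1"
}

# Typical use, after capturing both streams into one file:
#   htdig -i -s -v > log.txt 2>&1
#   show_error_context log.txt
```

Note that the order of the redirections matters: "> log.txt 2>&1" sends both streams to the file, whereas "2>&1 > log.txt" would leave stderr on the terminal.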

Thanks again,


Greg McCann

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
