I have posted a similar query on the pdftohtml list.
 
I'm attempting to crawl portions of the web with aspseek.  HTML output is working fine and is very stable.  I have configured pdftohtml as a converter.  It indexes most PDFs fine, so I don't think it's a config problem, but it crashes the crawl on some.  When I download the file and run pdftohtml on it from the command line, it works fine.  I'm currently running the latest sources from CVS, having first tried 1.2.6 and 1.2.10.  The aspseek log output is as follows:
 
( 2 20 20 182 12 29  7 20) Adding URL: http://www.lsic.com/fin/annual01.pdf
exec /usr/bin/pdftohtml -i -noframes -stdout /tmp/asi5dQRXA >/tmp/asoXjR7TX
Address of param: ba072d20
Address of param: ba07a560
 
 
All 20 threads then crash.
 
I just started using pdftohtml yesterday.  Do I need different parameters?
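
For anyone who wants to reproduce this outside the crawler, here is a minimal sketch in Python of running the same invocation by hand. It assumes the flags shown in the log line above. The helper names, the default binary path, and the 60-second timeout are placeholders of mine, not anything defined by aspseek or pdftohtml.

```python
import subprocess

# Flags copied from the "exec" line in the aspseek log.
PDFTOHTML_FLAGS = ["-i", "-noframes", "-stdout"]


def build_command(pdf_path, binary="/usr/bin/pdftohtml"):
    """Return the argument list for the invocation seen in the log."""
    return [binary, *PDFTOHTML_FLAGS, pdf_path]


def run_converter(pdf_path, out_path, binary="/usr/bin/pdftohtml", timeout=60):
    """Run the converter with stdout redirected to out_path.

    Returns the exit code. On POSIX a negative value means the
    process was killed by a signal (for example -11 for SIGSEGV).
    The timeout is an arbitrary placeholder.
    """
    with open(out_path, "wb") as out:
        proc = subprocess.run(
            build_command(pdf_path, binary),
            stdout=out,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    return proc.returncode
```

Running `run_converter` on a downloaded copy of a PDF that breaks the crawl would show whether the converter itself exits abnormally with these exact flags, or whether the failure only happens when aspseek launches it.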
