I just installed version 3.0.1 of tesseract (I used the Windows installer for 3.0 and then added the zipped 3.0.1 files to the directory). Only the English training file is present, for now. I then tested tesseract using the phototest.tif file in the doc subdir and it worked just fine. (Admin privileges were set.)
(I'm running on Windows 7 Professional, 64-bit, on a Lenovo T510 laptop.)

I also installed ImageMagick 6.7.2-Q16 using their installer. I then used it to convert a PDF article into eight .tif page files, with the following command:

    convert -density 150 -depth 8 -colorspace gray -verbose pic32.PDF p%02d.tif

This produced the files p00.tif through p07.tif without reporting an error and, as far as I can tell, the images are correct. They display fine in Windows Live Photo Gallery, for example.

However, tesseract 3.0.1 crashes on every one of these .tif files (Windows offers to look up possible solutions before killing the program). I have placed the first two files on my web site at:

    http://www.infinitefactors.org/misc/images/tesseract/p00.tif
    http://www.infinitefactors.org/misc/images/tesseract/p01.tif

(These files are each about 4 megabytes in size. The directory listing is disabled and only the two files listed above are world readable, in a modest attempt to protect the copyright holder and to keep the focus on the problem I'm having.)

I'm not sure if I need to change the ImageMagick conversion settings, as all of this is pretty new to me. (First time out.) It's possible that converting the PDF with settings more to tesseract's liking would give better results. I will attempt a few changes on my own, mostly at random because of my profound ignorance, but I'm looking for helpful thoughts in the meantime.

My long-term hope is to learn how to take the huge PDF scans of old documents I have and turn them into much smaller, searchable PDFs in which the text has been converted well. For now, I'd just like to figure out how to make these tif pages work.

Thanks in advance. And I apologize for my ignorance.

Jon
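P.S. In case it helps anyone reproduce this, here is a rough Python sketch of the two steps described above: the ImageMagick conversion and then one tesseract run per page. The function names and the extra_args parameter are my own inventions for illustration, and the sketch assumes that ImageMagick's convert and tesseract are the programs found first on the PATH. It does not fix anything; it only makes it easier to rerun the same steps with different conversion settings and see which pages fail.

```python
import subprocess


def page_names(prefix, count):
    """Names ImageMagick writes for a %02d pattern: p00.tif, p01.tif, ..."""
    return ["%s%02d.tif" % (prefix, i) for i in range(count)]


def convert_command(pdf, prefix="p", density=150, depth=8, extra_args=()):
    """Argument list for the convert call quoted in the message.

    extra_args is a hypothetical hook for trying other ImageMagick
    settings; it is empty by default, which reproduces the original call.
    """
    return (["convert", "-density", str(density), "-depth", str(depth),
             "-colorspace", "gray", "-verbose"]
            + list(extra_args)
            + [pdf, prefix + "%02d.tif"])


def tesseract_command(tif, lang="eng"):
    """Argument list for: tesseract <image> <output base> -l <lang>."""
    base = tif.rsplit(".", 1)[0]  # p00.tif -> p00 (tesseract adds .txt)
    return ["tesseract", tif, base, "-l", lang]


def run_all(pdf, count, extra_args=()):
    """Convert the PDF, then OCR each page and record the exit codes.

    Assumes both programs are on the PATH; a crashing tesseract run
    should show up here as a nonzero exit code for that page.
    """
    subprocess.call(convert_command(pdf, extra_args=extra_args))
    results = {}
    for tif in page_names("p", count):
        results[tif] = subprocess.call(tesseract_command(tif))
    return results
```

Calling run_all("pic32.PDF", 8) would repeat what I did by hand, and passing a different density or a list of extra_args would let me compare settings one at a time.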

