It is not about input, but output. PDF output is a key feature of the leptonica 1.71 release (and tesseract 3.03/3.04), and I guess it has not been tested on cygwin yet.
Zdenko

On Fri, Jul 24, 2015 at 8:42 AM, Simon Eigeldinger <[email protected]> wrote:

> Hi,
>
> i never tried to give tesseract a pdf as an input.
> cygwin has leptonica 1.71 or 1.72 by default so i used this for compiling.
> maybe leptonica doesn't like pdf files so it might complain.
> so ShreeDevi Kumar might convert the pdf into an image or he uses a normal
> image (tif, jpg, etc.).
>
> greetings,
> simon
>
> On 24.07.2015 at 08:17, zdenko podobny wrote:
>
>> On Fri, Jul 24, 2015 at 7:10 AM, ShreeDevi Kumar <[email protected]>
>> wrote:
>>
>>> C:\Users\User\Downloads\TESS>tesseract test/eurotext.tif
>>> test/eurotext-eng-pdf -l eng pdf
>>> Tesseract Open Source OCR Engine v3.04.00 with Leptonica
>>> Page 1
>>> Error in fopenWriteStream: stream not opened
>>> Error in pixWrite: stream not opened
>>> Error in fopenReadStream: file not found
>>> Error in extractG4DataFromFile: stream not opened to file
>>> Error in l_generateG4Data: datacomp not extracted
>>> Error in pixGenerateCIData: g4 data not made
>>> Error in l_generateCIDataForPdf: file test/eurotext.tif format is 4;
>>> unreadable
>>> Error during processing.
>>>
>>> It looks like leptonica issue. Did you try to build and run leptonica
>>> progs (all that has pdf in name)?
>>
>> Zdenko
>
> --
> Simon Eigeldinger
> Follow me on Twitter: http://www.twitter.com/domasofan/
> E-Mail: [email protected]
> MSN: [email protected]
> ICQ: 121823966
> Jabber: [email protected]
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/55B1DE3B.1090906%40vol.at.
>
> For more options, visit https://groups.google.com/d/optout.

