Hi,

Yes, ImageMagick is quite useful. I'll try using that.

I am wondering, are Leptonica and ImageMagick equivalent libraries, if not the same? Can Tesseract work with either of them?

Thanks.

On Friday, September 7, 2012 10:27:42 PM UTC+8, sventech wrote:
> Use the ImageMagick library.
> --Sven
>
> On Fri, Sep 7, 2012 at 2:23 AM, newtotesseract <[email protected]> wrote:
> > Hi Rob,
> >
> > Yes, fax2tiff could be one way.
> > But actually, I'm extracting the CCITTFaxDecode stream data from the PDFs
> > and trying to extract OCR text from them.
> > So, I am trying to do this conversion all in memory instead of writing to
> > files.
> >
> > thanks
> >
> > On Friday, September 7, 2012 12:52:12 PM UTC+8, rkomar wrote:
> >>
> >> On Thu, 6 Sep 2012, newtotesseract wrote:
> >>
> >> > Hi Nick,
> >> > I tried passing in the CCITTFaxDecode data to tesseract,
> >> > but it was not detected as TIFF.
> >> >
> >> > It seems like CCITT fax is not same as TIFF.
> >> >
> >> > Google search showed me that few other people also faced
> >> > same issue
> >> > (e.g. "http://stackoverflow.com/questions/2641770/extracting-image-from-pdf-with-ccittfaxdecode-filter").
> >> >
> >> > If you know, how we can convert the CCITT-Fax to tiff or
> >> > jpeg, it would be really helpful.
> >> >
> >> > Many thanks for your help and time.
> >> >
> >> > Thanks,
> >> > - ganesh
> >>
> >> TIFF files can contain many kinds of image data compressed
> >> with all sorts of types of compression. CCITT _is_ one
> >> of the supported compression types. If you can install
> >> ImageMagick, then you can use the 'convert' program in that
> >> package to create your TIFF file. For example:
> >>
> >>    > convert in.fax -compress Group4 out.tif
> >>
> >> converts the file to TIFF using CCITT Group4 compression
> >> in the output.
> >>
> >> Or, if you have libtiff installed, then you can use
> >> the fax2tiff program to do the conversion.
> >>
> >> Don't convert to jpeg; it isn't meant for bi-level images.
> >>
> >> Cheers,
> >> Rob Komar
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to
> > [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
>
> --
> ``All that is gold does not glitter,
> not all those who wander are lost;
> the old that is strong does not wither,
> deep roots are not reached by the frost.
> From the ashes a fire shall be woken,
> a light from the shadows shall spring;
> renewed shall be blade that was broken,
> the crownless again shall be king.”
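For the in-memory case, here is a minimal sketch of one possible approach, building on Rob's point that CCITT is one of the compression types TIFF supports: prepend a small single-strip TIFF header to the raw CCITTFaxDecode bytes, so nothing is written to disk. The function name wrap_ccitt_as_tiff and its parameters are my own, not part of any library. It assumes the stream is Group 4 data (a negative /K in the PDF decode parameters), that width and height are taken from the PDF image dictionary, and that white is stored as zero; if the result comes out inverted, the photometric value may need to be 1 instead.

```python
import struct

SHORT, LONG = 3, 4  # TIFF field types


def wrap_ccitt_as_tiff(data: bytes, width: int, height: int,
                       group: int = 4, photometric: int = 0) -> bytes:
    """Prepend a minimal little-endian TIFF header to raw CCITT fax data.

    Hypothetical helper: assumes a single strip holding the whole image.
    group is 3 or 4 (TIFF Compression tag value), photometric 0 = WhiteIsZero.
    """
    if group not in (3, 4):
        raise ValueError("group must be 3 or 4")

    n_tags = 8
    # 8-byte file header + 2-byte tag count + 12 bytes per tag + 4-byte next-IFD offset
    data_offset = 8 + 2 + 12 * n_tags + 4

    # Tags must appear in ascending numeric order.
    tags = [
        (256, LONG, 1, width),         # ImageWidth
        (257, LONG, 1, height),        # ImageLength
        (258, SHORT, 1, 1),            # BitsPerSample: bi-level
        (259, SHORT, 1, group),        # Compression: 3 = Group 3, 4 = Group 4
        (262, SHORT, 1, photometric),  # PhotometricInterpretation
        (273, LONG, 1, data_offset),   # StripOffsets: data follows the header
        (278, LONG, 1, height),        # RowsPerStrip: one strip for all rows
        (279, LONG, 1, len(data)),     # StripByteCounts
    ]

    out = struct.pack("<2sHI", b"II", 42, 8)  # byte order, magic, offset of first IFD
    out += struct.pack("<H", len(tags))
    for tag, field_type, count, value in tags:
        out += struct.pack("<HHII", tag, field_type, count, value)
    out += struct.pack("<I", 0)               # no further IFDs
    return out + data
```

The returned bytes should look like an ordinary Group 4 TIFF to any TIFF-aware reader, so in principle they could be passed along in memory rather than through a file. I have not verified this against Leptonica or Tesseract.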

