Sorry, you did not say that; I mixed a few things up while reading up on the 
issue this morning. This is what I was referring to:

According to IrfanView, the image is an LZW-compressed TIF file at 300 DPI. What 
Quan says is correct: the image is a heavily compressed TIF. In my experience, 
Tesseract-OCR supports only uncompressed TIF files.
Sriranga(78yrsold)
Thanks for pointing it out.
Mike

From: zdenko podobny [mailto:[email protected]]
Sent: Monday, 28 March 2011 14:34
To: Lutz, Michael
Cc: Dmitri Silaev; [email protected]; Richard Genthner
Subject: Re: tesseract.exe has stopped working on win2008 r2


On Mon, Mar 28, 2011 at 11:54 AM, Lutz, Michael <[email protected]> wrote:
Hi All,

So the image Richard gave us is a compressed TIF file. Since tesseract only 
supports uncompressed TIF images as noticed by Zdenko you will not get any 
results from this image.

Incorrect:

 1.  Image support is the task of leptonica, so the list of supported formats 
can be found on the leptonica website and in its source code. I think we really 
need to make this distinction, because upgrading leptonica could add support 
for a new format without changing a line of tesseract code.
 2.  I guessed that leptonica has a problem with tiff files using "lzw 
compression". When I created a tiff with "zip compression" it worked (other 
compression algorithms are also available in tiff: Packbits, G4, G3, ...). I 
never said that leptonica (tesseract) supports only uncompressed tiff. I am 
sorry if I was not clear about this.
 3.  As TP corrected me: the problem is not the LZW compression but "Samples 
per Pixel". Leptonica supports 1, 3 and 4; the input image uses 2, which is 
unsupported. To "solve" this, just open the input file in IrfanView and save it 
as tiff with lzw compression. This changes "Samples/Pixel" to 1 automatically ;-)
 Zdenko
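
The check described in point 3 can be sketched in a few lines of Python. This 
is only an illustration, not code from leptonica or tesseract: it assumes a 
classic (non-BigTIFF) file, looks only at the first image directory, and the 
function names read_tiff_tags and describe_tiff are made up for this example. 
The tag numbers (259 = Compression, 277 = SamplesPerPixel) and the compression 
codes are taken from the TIFF 6.0 specification; the set {1, 3, 4} is the one 
reported in the leptonica 1.67 error message quoted in this thread.

```python
import struct

# Tag numbers and compression codes as defined in the TIFF 6.0 specification.
TAG_COMPRESSION = 259
TAG_SAMPLES_PER_PIXEL = 277
COMPRESSION_NAMES = {1: "none", 3: "CCITT G3", 4: "CCITT G4", 5: "LZW",
                     8: "Deflate (zip)", 32773: "PackBits"}

# struct format characters for the field types decoded here: BYTE, SHORT, LONG.
_TYPE_FORMATS = {1: "B", 3: "H", 4: "I"}


def read_tiff_tags(data):
    """Return {tag: value} for single-value tags in the first IFD of a TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i   # each IFD entry is 12 bytes long
        tag, ftype, n = struct.unpack_from(endian + "HHI", data, entry)
        fmt = _TYPE_FORMATS.get(ftype)
        if fmt is None or n != 1:
            continue   # skip arrays and types this sketch does not decode
        # A single small value is stored left-justified in the value field.
        (value,) = struct.unpack_from(endian + fmt, data, entry + 8)
        tags[tag] = value
    return tags


def describe_tiff(data):
    """Summarise the two properties discussed in this thread."""
    tags = read_tiff_tags(data)
    spp = tags.get(TAG_SAMPLES_PER_PIXEL, 1)      # TIFF default is 1
    compression = tags.get(TAG_COMPRESSION, 1)    # TIFF default is 1 (none)
    return {
        "samples_per_pixel": spp,
        "compression": COMPRESSION_NAMES.get(compression, str(compression)),
        # Set reported by leptonica 1.67; other versions may differ.
        "spp_in_reported_set": spp in (1, 3, 4),
    }
```

For a file on disk one would read it in binary mode first, for example 
describe_tiff(open("scan.tif", "rb").read()), where scan.tif is a placeholder 
name. A result with samples_per_pixel equal to 2 would match the case 
described above, whatever the compression turns out to be.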

I attached the image as an uncompressed TIF file (see uncompressed.zip); 
tesseract processes this image without any problems.
Also attached is tesseract.zip, which should unpack to a file named 
tesseract.executable; if it got through, just rename it to tesseract.exe. It is 
a static release build made on Win7 with WinSDK 7.1, in case anyone still wants 
it.

Regards,
Mike

-----Original Message-----
From: Dmitri Silaev [mailto:[email protected]]
Sent: Saturday, 26 March 2011 22:04
To: [email protected]
Cc: zdenko podobny; Lutz, Michael; Richard Genthner
Subject: Re: tesseract.exe has stopped working on win2008 r2

Guys, I still can't understand what error Tesseract produces.
Let's wait for the error screenshot. Or have you already understood
everything? Richard says he's got an error message...

Warm regards,
Dmitri Silaev





On Sat, Mar 26, 2011 at 5:42 PM, zdenko podobny <[email protected]> wrote:
>
>
> On Fri, Mar 25, 2011 at 5:40 PM, Lutz, Michael <[email protected]> wrote:
>>
>> Hi,
>>
>> I just ran your tif file and got no results; it must have something to do
>> with the size of the image. If I run a portion of the tiff smaller than
>> 1000x1000, I get results.
>>
>> Can somebody explain why a tif of this size (2480x3508 @ 8BPP) is not
>> processed?
>
> This is not a tesseract issue but a leptonica one (the library used for image
> handling). When I ran it on linux I got error messages coming from leptonica
> (1.67; I have not tried 1.68 on linux yet):
> Error in pixReadFromTiffStream: spp not in set {1,3,4}
> Error in pixReadStreamTiff: pix not read
> Error in pixReadTiff: pix not read
> On Windows the leptonica "release version" library does not show error/warning
> messages because of the compile option "NO_CONSOLE_IO"
> (see http://code.google.com/p/leptonica/issues/detail?id=42).
> It looks like leptonica does not support lzw compression for tiff
> (see http://www.leptonica.com/source/README.html  "9. Image I/O" - lzw is
> mentioned in the png and gif sections, but not with tif). When I change the
> tif compression from lzw to zip (BTW: this also gives a smaller image),
> tesseract produces output (on XP SP3).
> Zdenko
>
>> Mike
>>
>>
>>
>> From: Richard Genthner [mailto:[email protected]]
>> Sent: Friday, 25 March 2011 17:04
>> To: Lutz, Michael
>> Cc: [email protected]
>>
>> Subject: Re: tesseract.exe has stopped working on win2008 r2
>>
>>
>>
>> Here is the screenshot and the tif file. Dmitri, if you rename the .exe,
>> that should work. I'm trying to get the training data up.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to 
>> [email protected]<mailto:[email protected]>.
>> To unsubscribe from this group, send email to
>> [email protected]<mailto:tesseract-ocr%[email protected]>.
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.

--- Begin Message ---
According to IrfanView, the image is an LZW-compressed TIF file at 300 DPI. What 
Quan says is correct: the image is a heavily compressed TIF. In my experience, 
Tesseract-OCR supports only uncompressed TIF files.


On Sat, Mar 26, 2011 at 6:17 PM, Quan Nguyen <[email protected]> wrote:


The image appears to have been heavily compressed. Running OCR on the whole
image did not yield anything. Doing it blockwise, I got some results, but they
were not very accurate:

Ch Juhe 24, 2@@9 the ACHP vctect ct: revisect teccmmehdettcns tcr
mee_s1es-muhqes-t'ube[[e (NR/H~
‘evictetnce ct tmmuhity’ requtrementstcr heetthcete teefschheh‘. The
Heatthcate thtecttctn Ochtrct
Ptectices Aciviscry Ccmrmttee (HHCPAG) has ernctcfsed these changes.


--- End Message ---
