Version 2.04 fixed the problem.

Saving the snapshot as j.tiff with no compression and no alpha channel, then running

convert j.tiff -colorspace gray j.tif
tesseract j.tif outj -l eng

sealed the deal with beautiful OCR.
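
For anyone who wants to script this, here is a rough sketch that chains the steps from this thread into one shell function. The convert and tesseract flags are the ones quoted in these messages. The names ocr_snapshot, run and DRY_RUN are made up for illustration, and folding the flatten and grayscale steps into a single convert call is my assumption, not something tested here.

```shell
#!/bin/sh
# Sketch only: ocr_snapshot, run and DRY_RUN are hypothetical names,
# not part of ImageMagick or tesseract.

run() {
    # In dry-run mode print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ocr_snapshot() {
    src=$1                   # input TIFF, e.g. a grab from Preview
    base=$2                  # output base name; tesseract appends .txt
    gray="${base}-gray.tif"  # intermediate grayscale image

    # Flatten onto white, drop the alpha channel, disable compression,
    # and convert to grayscale (assumed to work in one convert call).
    run convert "$src" -background white -flatten +matte +compress \
        -colorspace gray "$gray" || return 1

    # OCR the cleaned image with the English language data.
    run tesseract "$gray" "$base" -l eng
}
```

Running `DRY_RUN=1 ocr_snapshot sn3.tif outj` only prints the two commands, which is a cheap way to check the file names before touching any images. Without DRY_RUN it needs both convert and tesseract on the PATH.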

thanks Salahuddin and Ray

On Jun 16, 3:02 pm, timmckenna <[email protected]> wrote:
> on my macbook leopard  with tesseract-1.0.3 and the english v.2.0
> language pack and using the command
>
> L-605683-098391:Desktop teacher$ tesseract sna.tif outa -l eng
> Tesseract Open Source OCR Engine
>
> on the file http://sitebuilt.net/files/sna.tif I
> get http://sitebuilt.net/files/outa.txt
> that looks like this:
>
> .**;.5 4 r»¢~:—;L¤1L, 11;·;;»L;y_;I1 LJ
> .><>rL¢~r1Liz;1 11; rr-—·r<>g_:1iz
> 1:-—;l.r1;·¢~r:—; ¤.¤.·I1.r> ;;r»¢~ rL;11.
> :—;rL¤l11·:I;;r·:I_ 311 I>rzuvrLi·;r<·
> 1:-—r1L¤r<» .:l<».rrisi·;;11, [1 is
> {4 I I>¢~r.rr<·11L‘| r<~.><>rL 11
> 1:-—;l.rr1;¢~r in; 1;i:—; (m-r 11:-—r
>
> The original file was a 'grab' from preview of a pdf that is
> protected. It creates a tiff image that I then convert to uncompressed
> and get rid of the alpha channel by running imageMagick like so:
>
> convert sn3.tif  -background white -flatten +matte +compress  sna.tif
>
> Any ideas? Could somebody try that tif file and see if it works on
> yours or give me one that you know works so I can try it here?
>
> Thanks
