On Wed, Jun 19, 2013 at 3:20 PM, llozano <[email protected]> wrote:

> Francesco,
>
> Would you mind posting what this uzn file might look like?
>

Have a look at, for example:
https://isri-ocr-evaluation-tools.googlecode.com/files/zset.2B.tar.gz

> And what should the entire command be?
>

As far as I remember, if you use a psm value greater than 3, tesseract will
look for a uzn file (named after the image). If you are on Linux, you can
easily check this with strace.

So you can try something like this:
tesseract 8309_016.2B.tif 8309_016.2B_psm4 -psm 4
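
To make the whole procedure concrete, here is a minimal sketch. It assumes the zone file uses the UNLV layout of one zone per line (left, top, width, height, label); the helper names and the coordinates are made up for illustration and should be replaced with values measured on your own scan.

```shell
#!/bin/sh
# Hypothetical helper: derive the zone-file name from the image name
# (same base name with the last extension replaced by "uzn").
uzn_path_for() {
    printf '%s.uzn\n' "${1%.*}"
}

# Hypothetical helper: write two example zones next to the image.
# Assumed line layout: left top width height label.
# The numbers are placeholders, not taken from a real document.
write_example_uzn() {
    cat > "$(uzn_path_for "$1")" <<'EOF'
100 150 400 60 Text
100 300 400 60 Text
EOF
}

image="8309_016.2B.tif"
write_example_uzn "$image"

# Run OCR only if tesseract and the image are really there.
# "-psm 4" is the option spelling used in this thread (tesseract 3).
if command -v tesseract >/dev/null 2>&1 && [ -f "$image" ]; then
    tesseract "$image" "${image%.*}_psm4" -psm 4
fi

# To see whether the uzn file is actually opened (Linux only):
#   strace -f -e trace=open,openat tesseract "$image" out -psm 4 2>&1 | grep uzn
```

If the text in the output file comes only from the listed rectangles, the zone file was picked up.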


> I'm starting to research this area for a project and I am a bit puzzled.
> All I know is that I need to specify areas of a document to extract text
> from. The document is laid out in tables. Do I need to remove the lines
> if I specify areas?
>

The best way is to run your own tests and share your findings.

>
> Thanks
>
>
> On Thursday, July 5, 2012 11:00:10 AM UTC-4, Di Perna Francesco wrote:
>>
>> Ok. No one can help me.
>> I have found the solution anyway....:-)
>> Calling tesseract with the parameter "-psm 4" and giving the uzn file
>> the same name as the image seems to work.
>> Bye
>>
>> On 4 Jul, 13:16, Di Perna Francesco <[email protected]>
>> wrote:
>> > Hi, we use tesseract in a web application to recognize some numbers in
>> > documents acquired with a scanner.
>> > With tesseract 2 we used a "uzn" file to indicate in which areas of
>> > the TIFF file the numbers to be recognized are (the uzn file should
>> > have the same name as the TIFF file, with the "uzn" extension).
>> > We have now installed tesseract 3. My mistake was to assume that the uzn
>> > file works as in the previous version, but it doesn't.
>> > Can anyone explain to me how to recognize specific areas of the file in
>> > tesseract 3?
>> > Regards
>
>  --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
