This is awesome. Thanks for your reply. One more question, just to 
clarify.
In your command example, should 8309_016.2B_psm4 really have the _psm4 suffix? 
Is that intentional or just a typo? 
Do I need to pass the TIFF file through some filter to remove colors or 
something like that? The examples you shared in your tar.gz file, which are 
awesome, are in grayscale, and I am not sure about their resolution. Is 
there some image preparation that would improve the output?
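
To check that I understand the setup, here is a minimal Python sketch of what I think is going on. Everything in it is an assumption on my part: that each line of the uzn file is "left top width height label" in pixels, that the uzn file sits next to the image with the same base name, and that the second argument of the command is only the output base name (so the _psm4 part would be arbitrary). The helper names write_uzn and tesseract_command and the zone coordinates are made up for illustration.

```python
import shlex
import tempfile
from pathlib import Path


def write_uzn(image_path, zones):
    """Write zones to <image base name>.uzn next to the image; return that path.

    zones: iterable of (left, top, width, height, label) tuples in pixels.
    The one-zone-per-line layout is an assumption, not confirmed in this thread.
    """
    uzn_path = Path(image_path).with_suffix(".uzn")
    lines = [f"{left} {top} {width} {height} {label}"
             for left, top, width, height, label in zones]
    uzn_path.write_text("\n".join(lines) + "\n")
    return uzn_path


def tesseract_command(image_path, output_base, psm=4):
    """Build the command line as spelled in this thread; it is not executed."""
    return ["tesseract", str(image_path), str(output_base), "-psm", str(psm)]


if __name__ == "__main__":
    # Work in a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        image = Path(tmp) / "8309_016.2B.tif"
        zones = [(100, 200, 800, 60, "Text")]  # made-up coordinates
        print(write_uzn(image, zones).name)
        print(shlex.join(tesseract_command(image.name, "8309_016.2B_psm4")))
```

If that is right, the only coupling between the image and the zones is the shared base name, and the command itself never mentions the uzn file.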

Thanks again!



On Wednesday, June 19, 2013 2:18:40 PM UTC-4, zdenop wrote:
>
>
> On Wed, Jun 19, 2013 at 3:20 PM, llozano <[email protected]> wrote:
>
>> Francesco,
>>
>> Would you mind posting what this uzn file might look like?
>>
>
> Have a look at  (e.g.) 
> https://isri-ocr-evaluation-tools.googlecode.com/files/zset.2B.tar.gz
>
>> And what should the entire command look like?
>>
>
> As far as I remember, if you use psm > 3, tesseract will look for a uzn file 
> (based on the image name). If you are on Linux, you can check this easily 
> with strace.
>
> So you can try something like this:
> tesseract 8309_016.2B.tif 8309_016.2B_psm4 -psm 4
>  
>
>> I'm starting to research this area for a project and I am a bit puzzled. 
>> All I know is that I need to specify areas of a document to extract text 
>> from. The document is laid out in tables. Do I need to remove the table 
>> lines if I specify areas?
>>
>
> The best way is to run your own tests and share your findings.
>
>>
>> Thanks
>>
>>
>> On Thursday, July 5, 2012 11:00:10 AM UTC-4, Di Perna Francesco wrote:
>>>
>>> OK, no one can help me. 
>>> I have found the solution anyway. :-) 
>>> Calling tesseract with the parameter "-psm 4" and giving the uzn file 
>>> the same name as the image seems to work. 
>>> Bye 
>>>
>>> On 4 Jul, 13:16, Di Perna Francesco <[email protected]> wrote: 
>>> > Hi, we use tesseract in a web application to recognize some numbers in 
>>> > documents acquired with a scanner. 
>>> > With tesseract2 we used the "uzn" file to indicate in which area 
>>> > of the tiff file the numbers to be recognized are (the uzn file should 
>>> > have the same name as the tiff file, with the "uzn" extension). 
>>> > We have now installed tesseract 3. My mistake was to assume that the uzn 
>>> > file works as in the previous version, but it doesn't. 
>>> > Can anyone explain how to recognize specific areas of the file in 
>>> > tesseract 3? 
>>> > Regards
>>
>>  -- 
>> -- 
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>  
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/groups/opt_out.
>>  
>>  
>>
>
>
