On Tue, Apr 30, 2013 at 2:43 PM, Ardian Nur Fazri <[email protected]> wrote:

> I have a unicharset with complex values. It is not like the one described at
> https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3; it has
> more fields than that.
>
*No matter what the documentation says, the source code is the ultimate
truth, the best and most definitive and up-to-date documentation you're
likely to find.*[1] ;-)

Suggested reading for today: unicharset.h[2] and unicharset.cpp[3].

[1]
http://www.codinghorror.com/blog/2012/04/learn-to-read-the-source-luke.html
[2]
https://code.google.com/p/tesseract-ocr/source/browse/trunk/ccutil/unicharset.h
[3]
https://code.google.com/p/tesseract-ocr/source/browse/trunk/ccutil/unicharset.cpp

> This is a sample from my unicharset file:
>
> a 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # a [61 ]a
> n 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # n [6e ]a
> c 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # c [63 ]a
> r 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # r [72 ]a
> k 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # k [6b ]a
> f 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # f [66 ]a
> d 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # d [64 ]a
> s 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # s [73 ]a
> w 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # w [77 ]a
>

In the past I did a "quick" examination[4], but I never found the time
(reason, priority???) for a deeper evaluation of this file. So if anybody
has the time/resources, please feel free to have a look at it and share
your experience with the community...

Here are some hints:

   - there can be several versions of the unicharset file format[5]
   - some parts of the data (e.g. ranges, script) are not filled in by the
   current training tools (i.e. they keep their default/initial values)
   - extracting the unicharset from the data files provided by Google can
   help with the analysis...

[4] http://www.sk-spell.sk.cx/first-notes-for-tesseract-ocr-302-traning
[5]
https://code.google.com/p/tesseract-ocr/source/browse/trunk/ccutil/unicharset.cpp?r=838#682
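
To make the hints above a bit more concrete, here is a minimal Python sketch
that splits one line of the quoted sample into named fields and checks
whether its ranges still hold the default values. The field names and their
order are my assumption from reading unicharset.cpp[3], the property bits
are the ones listed on the TrainingTesseract3 wiki page, and
parse_unicharset_line is a hypothetical helper, not part of Tesseract.
Please check all of it against the source of the version you use.

```python
# Field names and order are ASSUMPTIONS based on a reading of unicharset.cpp;
# other unicharset versions may have fewer or more fields.
RANGE_NAMES = [
    "min_bottom", "max_bottom", "min_top", "max_top",
    "min_width", "max_width", "min_bearing", "max_bearing",
    "min_advance", "max_advance",
]

# The range values seen in the sample, assumed to be the untouched defaults.
DEFAULT_RANGES = [0, 255, 0, 255, 0, 32767, 0, 32767, 0, 32767]

# Property bits as described on the TrainingTesseract3 wiki page.
PROPERTY_BITS = {
    0x01: "alpha",
    0x02: "lower",
    0x04: "upper",
    0x08: "digit",
    0x10: "punctuation",
}


def parse_unicharset_line(line):
    """Split one 16-field style unicharset entry into a dict (hypothetical helper)."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError("expected at least 7 whitespace-separated fields")
    # Anything after the seventh field (e.g. the "# a [61 ]a" comment) is ignored.
    unichar, props, ranges, script, other_case, direction, mirror = fields[:7]

    prop_bits = int(props, 16)  # the properties field is a hexadecimal bitmask
    range_values = [int(value) for value in ranges.split(",")]
    if len(range_values) != len(RANGE_NAMES):
        raise ValueError("expected %d range values" % len(RANGE_NAMES))

    return {
        "unichar": unichar,
        "properties": [name for bit, name in PROPERTY_BITS.items()
                       if prop_bits & bit],
        "ranges": dict(zip(RANGE_NAMES, range_values)),
        "ranges_are_default": range_values == DEFAULT_RANGES,
        "script": script,
        "other_case": int(other_case),
        "direction": int(direction),
        "mirror": int(mirror),
    }


entry = parse_unicharset_line(
    "a 3 0,255,0,255,0,32767,0,32767,0,32767 NULL -1 0 0 # a [61 ]a"
)
print(entry["properties"], entry["ranges_are_default"])
```

For the first sample line this reports the properties alpha and lower, and
ranges that are still at their defaults, which fits the hint that the
training tools did not fill those fields in.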


> Does anybody know what the values with the green background are for?
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
