Thanks, the problem is fixed now. The font name in font_properties and the
[lang].[fontname].exp[num] name used on the command line must be the same.
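
A minimal sketch of that check, in case it helps someone else. It is not part of Tesseract; the function names are hypothetical, and it assumes that the first whitespace-separated field on each font_properties line is the font name and that training files are named [lang].[fontname].exp[num] plus a tif, box or tr extension.

```python
import re

# Documented naming scheme: [lang].[fontname].exp[num].<ext>
NAME_RE = re.compile(
    r"^(?P<lang>[^.]+)\.(?P<font>[^.]+)\.exp(?P<num>\d+)\.(?:tif|box|tr)$"
)


def parse_training_name(filename):
    """Return (lang, fontname, num), or None if the name does not match."""
    m = NAME_RE.match(filename)
    if m is None:
        return None
    return m.group("lang"), m.group("font"), int(m.group("num"))


def font_names(font_properties_text):
    """Collect the first whitespace-separated field of each non-empty line."""
    names = set()
    for line in font_properties_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def name_matches_font_properties(filename, font_properties_text):
    """True only if the file name is well formed and its font is listed."""
    parsed = parse_training_name(filename)
    if parsed is None:
        return False
    return parsed[1] in font_names(font_properties_text)


if __name__ == "__main__":
    props = "TheFont 0 0 0 0 0\n"
    for name in ("oybab.A.0.tr", "oybab.A.exp0.tr", "oybab.TheFont.exp0.tr"):
        print(name, name_matches_font_properties(name, props))
```

With my original font_properties line, only the last name passes: the first has no "exp" part, and the second uses a font name that font_properties does not list.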

But there is one thing I can't understand. Is the fontname a real font name, or
just a label? If it is a real font name, does the program actually use it? And
if my font name has a space in the middle, what should I do? For example: <My Font>.

Thanks very much.

On Monday, January 14, 2013 at 2:55:49 AM UTC+8, zdenop wrote:
>
> On Sun, Jan 13, 2013 at 6:06 PM, zdenko podobny <[email protected]> wrote:
>
>> If you want help, then make sure you read the documentation[1], follow it 
>> closely, and search the forum/issues. Making multiple posts (forum+issues) will 
>> not help you.
>>
>> Just from reading your post it is clear that you do not follow the wiki, at least 
>> in these cases:
>>
>>    - Names of input files. If the documentation states it should be
>>    "[lang].[fontname].exp[num].tif", why do you use "[lang].[fontname].[num].tif"?
>>    - font_properties - it does not follow the documentation.
>>
>> If you want to run training for a non-Latin-based language, make sure you 
>> are able to run it for English first. Some problems have been reported with 
>> LTR training,
>>
>
> Oops, it should be RTL training...
>  
>
>> so it will help you eliminate problems caused by not following the 
>> documentation, as well as possible problems with non-Latin-based languages.
>>
>> [1] https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>>
>> Zdenko
>>
>>
>> On Sun, Jan 13, 2013 at 3:57 AM, gold snake <[email protected]> wrote:
>>
>>> help~~~~
>>>
>>> On Saturday, January 12, 2013 at 4:15:09 PM UTC+8, gold snake wrote:
>>>
>>>> *The error output is:*
>>>> D:\Little\Tesseract-OCR\build>shapeclustering -F font_properties -U unicharset -O oybab.unicharset oybab.A.0.tr
>>>> Reading oybab.A.0.tr ...
>>>> Font id = -1/0, class id = 1/2 on sample 0
>>>> font_id >= 0 && font_id < font_id_map_.SparseSize():Error:Assert failed:in file ..\..\classify\trainingsampleset.cpp, line 622
>>>>
>>>> *Here is my font_properties file content:*
>>>> TheFont 0 0 0 0 0
>>>>
>>>> *Here is the command-line output when I make the tr files:*
>>>> D:\Little\Tesseract-OCR\build>tesseract oybab.A.0.tif oybab.A.0 nobatch box.train
>>>> Tesseract Open Source OCR Engine v3.02 with Leptonica
>>>> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
>>>> TIFFReadDirectory: Warning, TIFFstream: unknown field with tag 37724 (0x935c) encountered.
>>>> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
>>>> TIFFReadDirectory: Warning, TIFFstream: unknown field with tag 37724 (0x935c) encountered.
>>>> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
>>>> TIFFReadDirectory: Warning, TIFFstream: unknown field with tag 37724 (0x935c) encountered.
>>>> row xheight=120.333, but median xheight = 83.5
>>>> row xheight=46.6667, but median xheight = 83.5
>>>> APPLY_BOXES: boxfile line 3/卅 ((312,53),(385,204)): FAILURE! Couldn't find a matching blob
>>>> APPLY_BOXES:
>>>>    Boxes read from boxfile:       4
>>>>    Boxes failed resegmentation:       1
>>>> APPLY_BOXES: Unlabelled word at :Bounding box=(312,53)->(369,122)
>>>>    Found 3 good blobs.
>>>>    1 remaining unlabelled words deleted.
>>>>
>>>>
>>>>
>>>>
>>>> *Here is my box file content:*
>>>> ئ 18 48 142 227 0
>>>> ئ 173 43 218 223 0
>>>> ئ 254 39 274 228 0
>>>> ئ 312 53 385 204 0
>>>>
>>>> *P.S. My language is similar to Arabic; it is written right to left. 
>>>> So what is the problem? Please help. Thanks so much.*
>>>>
>>>  -- 
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>
>>
>
