I used "[lang].[fontname].[num].tif" because I thought that was just a name.
I'm new here and have not used Google Code before. I could not find the
documentation or an example, so I just looked at some articles and followed
them. Very sorry.

Is my font_properties wrong? The documentation I read says the content must
be: <font name> 0 0 0 0 0.
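
To check both points before running the training tools again, here is a
minimal Python sketch. The function names are my own and are not part of
Tesseract. It assumes that the font name in font_properties has to match the
[fontname] part of the file name, which is how I understand the wiki, and that
each of the five flags is 0 or 1.

```python
import re

# "[lang].[fontname].exp[num].tif", the naming pattern quoted from the wiki.
NAME_RE = re.compile(r"^(?P<lang>[^.]+)\.(?P<font>[^.]+)\.exp(?P<num>\d+)\.tif$")


def parse_training_name(filename):
    """Return the lang/font/num parts, or None if the name does not fit."""
    match = NAME_RE.match(filename)
    return match.groupdict() if match else None


def font_properties_names(text):
    """Collect font names from lines shaped like '<font name> 0 0 0 0 0'."""
    names = []
    for line in text.splitlines():
        parts = line.split()
        # One name followed by five 0/1 flags (assumed format).
        if len(parts) == 6 and all(p in ("0", "1") for p in parts[1:]):
            names.append(parts[0])
    return names


def check(filename, font_properties_text):
    """Hypothetical helper: report the first problem found, or 'ok'."""
    info = parse_training_name(filename)
    if info is None:
        return "bad file name"
    if info["font"] not in font_properties_names(font_properties_text):
        return "font not listed"
    return "ok"


if __name__ == "__main__":
    print(check("oybab.A.0.tif", "TheFont 0 0 0 0 0"))
    print(check("oybab.A.exp0.tif", "TheFont 0 0 0 0 0"))
```

If that assumption is right, renaming the image to something like
oybab.TheFont.exp0.tif, or changing the font_properties entry to match the
file name, would be the thing to try.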

I will try again tomorrow. Thanks.

On Monday, January 14, 2013 at 1:06:22 AM UTC+8, zdenop wrote:
>
> If you want help, then make sure you read the documentation[1], follow it 
> closely and search the forum/issues. Making multiple posts (forum+issues) 
> will not help you.
>
> Just from reading your post it is clear that you do not follow the wiki, at 
> least in these cases:
>
>    - name of input files. If the documentation states it should be 
>    "[lang].[fontname].exp[num].tif", why do you use 
>    "[lang].[fontname].[num].tif"?
>    - font_properties - it is not according to the documentation.
>
> If you want to run training for a non-Latin-based language, make sure you 
> are able to run it for English first. Some problems have been reported with 
> LTR training, so this will help you to separate problems caused by not 
> following the documentation from possible problems with a non-Latin-based 
> language.
>
> [1] https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>
> Zdenko
>
>
> On Sun, Jan 13, 2013 at 3:57 AM, gold snake <[email protected]> wrote:
>
>> help~~~~
>>
>> On Saturday, January 12, 2013 at 4:15:09 PM UTC+8, gold snake wrote:
>>
>>> *The error output is:*
>>> D:\Little\Tesseract-OCR\build>shapeclustering -F font_properties -U unicharset -O oybab.unicharset oybab.A.0.tr
>>> Reading oybab.A.0.tr ...
>>> Font id = -1/0, class id = 1/2 on sample 0
>>> font_id >= 0 && font_id < font_id_map_.SparseSize():Error:Assert failed:in file ..\..\classify\trainingsampleset.cpp, line 622
>>>
>>> *Here is the content of my font_properties file:*
>>> TheFont 0 0 0 0 0
>>>
>>> *Here is the command-line output when I create the tr files:*
>>> D:\Little\Tesseract-OCR\build>tesseract oybab.A.0.tif oybab.A.0 nobatch box.train
>>> Tesseract Open Source OCR Engine v3.02 with Leptonica
>>> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
>>> TIFFReadDirectory: Warning, TIFFstream: unknown field with tag 37724 (0x935c) encountered.
>>> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
>>> TIFFReadDirectory: Warning, TIFFstream: unknown field with tag 37724 (0x935c) encountered.
>>> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
>>> TIFFReadDirectory: Warning, TIFFstream: unknown field with tag 37724 (0x935c) encountered.
>>> row xheight=120.333, but median xheight = 83.5
>>> row xheight=46.6667, but median xheight = 83.5
>>> APPLY_BOXES: boxfile line 3/卅 ((312,53),(385,204)): FAILURE! Couldn't find a matching blob
>>> APPLY_BOXES:
>>>    Boxes read from boxfile:       4
>>>    Boxes failed resegmentation:       1
>>> APPLY_BOXES: Unlabelled word at :Bounding box=(312,53)->(369,122)
>>>    Found 3 good blobs.
>>>    1 remaining unlabelled words deleted.
>>>
>>>
>>>
>>>
>>> *Here is the content of my box file:*
>>> ئ 18 48 142 227 0
>>> ئ 173 43 218 223 0
>>> ئ 254 39 274 228 0
>>> ئ 312 53 385 204 0
>>>
>>> *P.S. My language is similar to Arabic; it is written right to left. 
>>> So what is the problem? Please help. Thanks so much.*
>>>
>>  -- 
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>
>
>
