Hi,

That would also be my next question. Don't we need something like a 
separator? Some examples would be great. There is very little information 
on the internet, since Tesseract 4 is new.
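
To make the separator question concrete, here is a minimal sketch of one 
layout I believe is used for line-level training data. It is an assumption 
that has not been verified in this thread: every character of the line gets 
its own entry carrying the bounding box of the whole line, and one extra 
entry whose symbol is a tab character closes the line. The helper name 
line_box_entries is hypothetical, and the coordinates are taken from the 
"Thank you" example quoted below, read as left, bottom, right, top, page.

```python
def line_box_entries(text, left, bottom, right, top, page=0):
    """Build box-file entries for one text line (hypothetical helper).

    Assumption: each character repeats the bounding box of the whole
    line, and a tab entry marks the end of the line.
    """
    entries = []
    for ch in text:
        # A space in the text becomes an entry whose symbol is a space.
        entries.append(f"{ch} {left} {bottom} {right} {top} {page}")
    # Assumed end-of-line marker: a tab symbol with a tiny box placed
    # just past the top-right corner of the line.
    entries.append(f"\t {right} {top} {right + 1} {top + 1} {page}")
    return entries


if __name__ == "__main__":
    for entry in line_box_entries("Thank you", 10, 10, 800, 30):
        print(repr(entry))
```

If this layout is right, a space inside the line needs no special handling 
beyond its own entry, and the tab entry is the separator between lines.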

On Sunday, 27 January 2019 at 18:20:06 UTC+1, Li-Chung Chou wrote:
>
> Hi Timothy,
>
> I have the same question as Jul. Would you kindly share one 'textline' 
> box file and the corresponding image file that you used? I assume that if 
> I have one image containing one 'textline' reading "Thanks", then its 
> corresponding box file will have the contents below:
>
> Thanks 10 10 500 30 0  //the 10 10 500 30 rectangle contains whole 
> "Thanks" text?
>
> But I was wondering whether this still works if my 'textline' contains a 
> space character. For example, if I have an image containing one 
> 'textline' reading "Thank you", will its box file look like this?
>
> Thank you 10 10 800 30 0 //the 10 10 800 30 rectangle contains whole 
> "Thank you" text?
>
> I'm not sure whether my understanding is correct. It would be highly 
> appreciated if you could share some examples or experience with us. 
> Thank you very much!
>
> Li-Chung
>
> On Friday, 25 January 2019 at 22:47:47 UTC+8, Timothy Snyder wrote:
>>
>> I have successfully trained Tesseract 4.0 using boxes that cover an 
>> entire line. I was similarly confused by the mismatch between the docs and 
>> that example. I haven't tested training with character-bounding boxes, but 
>> I can confirm that textline boxes work fine.
>>
>> On Fri, Jan 25, 2019 at 5:56 AM Jul ius <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'm interested in training Tesseract 4 with real data. As the 
>>> documentation seems very sparse and only covers training with font files, 
>>> I have a general question.
>>>
>>> On: 
>>> https://github.com/tesseract-ocr/tesseract/wiki/Making-Box-Files---4.0
>>>
>>> It says that the boxes need to cover the whole line in tesseract 4. 
>>>
>>> When looking inside the linked box file I can clearly see that every box 
>>> covers a single character.
>>>
>>> Can anyone verify which layout for the boxes is right?
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/1ab1e0b0-a70a-456b-ab58-2f240a3b479f%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
