[tesseract-ocr] 2 questions about training tesseract

LHW Wed, 06 May 2020 02:56:36 -0700

(My English is not very good, so please bear with me.)
Hi, I'm learning how to train Tesseract with tesstrain 
<https://github.com/tesseract-ocr/tesstrain>.
I have a handwriting/printed-font dataset (.tif and .gt.txt pairs).
I read the tutorial 
<https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html> and 
successfully created a new traineddata file, but I want to add my dataset to 
an existing .traineddata file (e.g. eng.traineddata + dataset).


1. Can I add data to a traineddata file that already exists? Or can I merge 
two or more traineddata files into one file (other than with [-l model1+model2])?
2. My dataset contains syllables, words and sentences. Can I put them 
all in one folder and train, or should I train on each of them separately?
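
To make the dataset layout concrete: if everything went into one folder, a quick check that every .tif still has its matching .gt.txt could look like the sketch below. It only assumes the pairing by base name described above; the function name `find_unpaired` and the folder name in the usage note are hypothetical, and nothing here is part of tesstrain itself.

```python
from pathlib import Path


def find_unpaired(folder):
    """Return (images_without_text, texts_without_image) as sorted base names.

    Assumes pairs are named <base>.tif and <base>.gt.txt in the same folder.
    """
    folder = Path(folder)
    # Strip the suffixes by length so base names containing dots are preserved.
    images = {p.name[: -len(".tif")] for p in folder.glob("*.tif")}
    texts = {p.name[: -len(".gt.txt")] for p in folder.glob("*.gt.txt")}
    return sorted(images - texts), sorted(texts - images)
```

For example, `find_unpaired("my-dataset")` (a hypothetical folder name) returns two lists: images that have no transcription and transcriptions that have no image. Both being empty would mean the syllable, word and sentence samples can sit side by side as complete pairs.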

