The interesting part is that TextRecognitionDataGenerator also generates 
Tesseract-compatible box files. But I have found no easy way to produce the 
training files (such as .lstmf, .tif and the like) from the images and the 
box files made by TextRecognitionDataGenerator. I am pretty sure more 
experienced users already know how to do that. 
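
For reference, here is a minimal sketch of the route I would try first. It 
is not a verified recipe. It assumes each image sits next to a box file with 
the same base name, that Tesseract's stock lstm.train config is available, 
and that the box files are in a format lstm.train accepts (I have not 
confirmed this for TextRecognitionDataGenerator output). The function name 
build_lstmf_commands and the choice of --psm 13 (single raw text line) are 
my own assumptions. The sketch only builds the command lines; it does not 
run them.

```python
from pathlib import Path

# Image extensions to look for (an assumption; adjust to the generator's output).
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg"}


def build_lstmf_commands(data_dir, psm=13):
    """Return one tesseract command per image that has a matching .box file.

    Hypothetical helper: each command asks tesseract to run the lstm.train
    config, which is expected to write <base>.lstmf next to the image.
    """
    commands = []
    for image in sorted(Path(data_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Only images with a box file of the same base name are usable.
        if not image.with_suffix(".box").exists():
            continue
        base = image.with_suffix("")  # output base name, without extension
        commands.append(
            ["tesseract", str(image), str(base), "--psm", str(psm), "lstm.train"]
        )
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the 
box format has been checked by hand on a single image.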
On Wednesday, November 8, 2023 at 8:51:51 AM UTC+3 Des Bw wrote:

> text2image is a great tool shipped with Tesseract. It generates synthetic 
> training data by rendering images from text files. It has a few control 
> parameters to make the generated images look similar to scanned images. 
>
> But I have lately learned that the images generated by text2image are 
> nowhere near as realistic as the ones generated by 
> https://github.com/Belval/TextRecognitionDataGenerator. The latter tool 
> has more powerful controls to produce the exact type of image you want to 
> generate. 
>
>
> - Has anyone found a way of making Tesseract work with other text image 
> generation tools such as TextRecognitionDataGenerator?
> - If so, what was the experience like?
> - And for the developers: is there any way to replace text2image 
> with TextRecognitionDataGenerator?
>
>
