On Monday, November 13, 2023 at 5:35:20 AM UTC-5 [email protected] wrote:


Yeah it seems page segmentation is the crucial issue. If the bounding boxes 
are good, the recognition is usually very good.

I think I've sort of reached the limit on what I can do with base 
Tesseract. I think the next step would be custom training / fine-tuning.


Tesseract's page layout analysis / segmentation isn't training-based, so I 
don't think this is going to help you. If you wanted to recognize the C/L 
glyph, you could do fine-tuning for it, but it's not going to help 
you with the problem of finding rotated text and accurately determining 
bounding boxes for text of interest.

It's been ages since I've done serious image processing, but I'd recommend 
looking at something like OpenCV's text detection:
https://docs.opencv.org/4.8.0/d4/d43/tutorial_dnn_text_spotting.html
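For what it's worth, detectors like the ones in that tutorial typically return rotated quads (four corner points) rather than upright rectangles, so you'd deskew each region before handing it to Tesseract. Here's a rough, hypothetical helper for that step; it assumes the quad's points are given in order around the box, which is how I remember OpenCV returning them, but check against your version:

```python
import math

def quad_angle_and_bounds(quad):
    """Given a detected text quad as four (x, y) corner points,
    return the rotation angle (degrees) of its longer edge and the
    axis-aligned bounding box (x_min, y_min, x_max, y_max) you could
    crop and rotate before passing the region to Tesseract."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    # Two adjacent edge vectors; take the longer one as the text direction.
    e1 = (quad[1][0] - quad[0][0], quad[1][1] - quad[0][1])
    e2 = (quad[2][0] - quad[1][0], quad[2][1] - quad[1][1])
    edge = e1 if math.hypot(*e1) >= math.hypot(*e2) else e2
    angle = math.degrees(math.atan2(edge[1], edge[0]))
    return angle, (min(xs), min(ys), max(xs), max(ys))

# An upright 10x2 box: no rotation needed.
angle, bounds = quad_angle_and_bounds([(0, 0), (10, 0), (10, 2), (0, 2)])
# A box tilted 45 degrees: crop the bounds, then rotate by -angle.
angle2, bounds2 = quad_angle_and_bounds([(0, 0), (7, 7), (5, 9), (-2, 2)])
```

The actual detection call and the image warp are in the tutorial; this is just the geometry in between.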

Aspirationally, you can get some idea of what's possible by playing with 
Google's Cloud Vision API demo
https://cloud.google.com/vision/docs/drag-and-drop

It lets you just drag & drop an image and then inspect the results both 
visually and via the JSON that the API produces.
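If you end up wanting the same results programmatically, the demo is backed by the `images:annotate` endpoint. A minimal sketch of the request body, assuming a TEXT_DETECTION feature and a hypothetical image URI (you'd POST this to https://vision.googleapis.com/v1/images:annotate with your own credentials; the network call is omitted here):

```python
import json

def annotate_request(image_uri):
    """Build the JSON body for a Cloud Vision images:annotate call
    asking for text detection on an image at the given URI."""
    return {
        "requests": [{
            "image": {"source": {"imageUri": image_uri}},
            "features": [{"type": "TEXT_DETECTION"}],
        }]
    }

# "gs://my-bucket/scan.png" is a made-up example path.
body = json.dumps(annotate_request("gs://my-bucket/scan.png"))
```

The response's textAnnotations include per-word bounding polygons, which is exactly the sort of output the demo shows you visually.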

Good luck!

Tom

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/3a6e0271-db4b-4624-bada-51167dd6d744n%40googlegroups.com.
