On Monday, November 13, 2023 at 5:35:20 AM UTC-5 [email protected] wrote:
> Yeah it seems page segmentation is the crucial issue. If the bounding boxes are good, the recognition is usually very good. I think I've sort of reached the limit on what I can do with base Tesseract. I think the next step would be custom training / fine-tuning.

Tesseract's page layout analysis / segmentation isn't training based, so I don't think training is going to help you. If you wanted to recognize the C/L glyph, you could fine-tune for it, but that's not going to help you with the problem of finding rotated text and accurately determining bounding boxes for text of interest.

It's been ages since I've done serious image processing, but I'd recommend looking at something like OpenCV's text detection:
https://docs.opencv.org/4.8.0/d4/d43/tutorial_dnn_text_spotting.html

Aspirationally, you can get some idea of what's possible by playing with Google's Cloud Vision API demo:
https://cloud.google.com/vision/docs/drag-and-drop
It lets you just drag & drop an image and then inspect the results both visually and via the JSON that the API produces.

Good luck!

Tom
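If you save the JSON from that demo, a rough sketch like the one below could help with inspecting it for rotated text. It assumes the response has the usual `textAnnotations` entries with `boundingPoly.vertices`, that zero-valued `x`/`y` fields may be omitted, and that the first two vertices follow the top edge of the text in reading order. The `text_boxes` helper and the sample data are made up for illustration, so check them against what the demo actually returns for your images.

```python
import json
import math


def text_boxes(response_json):
    """Yield (text, angle_degrees, vertices) for each text annotation.

    Hypothetical helper: assumes Cloud Vision style JSON where each
    annotation has boundingPoly.vertices listed in reading order
    (top-left, top-right, bottom-right, bottom-left of the text).
    """
    data = json.loads(response_json)
    # Accept either a {"responses": [...]} wrapper or a single response.
    response = data["responses"][0] if "responses" in data else data
    for ann in response.get("textAnnotations", []):
        # Zero-valued coordinates may be missing from the JSON, so default to 0.
        pts = [(v.get("x", 0), v.get("y", 0))
               for v in ann["boundingPoly"]["vertices"]]
        if len(pts) < 2:
            continue
        (x0, y0), (x1, y1) = pts[0], pts[1]
        # Angle of the top edge. Image y grows downward, so a positive
        # angle means the text is rotated clockwise as displayed.
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
        yield ann.get("description", ""), angle, pts


# Made-up sample: one upright label and one reading top to bottom.
sample = json.dumps({"textAnnotations": [
    {"description": "C/L",
     "boundingPoly": {"vertices": [{"x": 10, "y": 20}, {"x": 50, "y": 20},
                                   {"x": 50, "y": 35}, {"x": 10, "y": 35}]}},
    {"description": "ROTATED",
     "boundingPoly": {"vertices": [{"x": 100}, {"x": 100, "y": 80},
                                   {"x": 85, "y": 80}, {"x": 85}]}},
]})

for text, angle, pts in text_boxes(sample):
    print(f"{text!r}: {angle:.1f} degrees, corners {pts}")
```

Boxes with an angle far from zero are the ones you would want to crop and rotate upright before handing them to Tesseract for recognition.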

