Any ideas?
Mert T wrote on Thursday, 8 February 2024 at 17:16:16 UTC+1:
> Hello,
>
> I'm new to Tesseract, and the text recognition is producing many errors.
> I'm scanning a German prescription and only want to read certain areas
> of it.
> I crop those areas (marked in blue) into new Bitmaps and pass them to the
> ProcessImage method. I edit each Bitmap with AForge to remove the red text
> and make the gray text darker (see screenshot). The 'X' is not
> recognized. If any letter is recognized, the checkbox should be checked.
> I tried a higher scan quality (600 dpi) to improve the results, but I
> got the best results at 150 dpi.
> Tesseract has many features. I tried some of them, but I don't know
> how to use them well enough to solve my problems. Could someone help me out?
>
> Thanks.
>
> Here is my code:
>
> public string ProcessImage(Bitmap image)
> {
>     // Preprocess: remove the pink text and darken the gray text.
>     image = RemovePinkTextAndMakeGrayTextDarker(image);
>
>     using var engine = new TesseractEngine("./tessdata", "deu", EngineMode.Default);
>     using var img = PixConverter.ToPix(image);
>     // Pass the converted Pix (img) to the engine rather than the Bitmap.
>     using var page = engine.Process(img, PageSegMode.AutoOsd);
>     return page.GetText();
> }
>
> private Bitmap RemovePinkTextAndMakeGrayTextDarker(Bitmap image)
> {
>     var filter = new EuclideanColorFiltering
>     {
>         CenterColor = new RGB(Color.HotPink),
>         Radius = 80,
>         FillColor = new RGB(Color.White),
>         FillOutside = false
>     };
>     filter.ApplyInPlace(image);
>
>     var filter3 = new EuclideanColorFiltering
>     {
>         CenterColor = new RGB(Color.DarkGray),
>         Radius = 80,
>         FillColor = new RGB(Color.Black),
>         FillOutside = false
>     };
>     filter3.ApplyInPlace(image);
>
>     return image;
> }
>
> [image: 150 scan.png]
>
> [image: Screenshot marked.png]
>
> [image: Scanarea.png]
>
>
>