You should restrict Tesseract so that it outputs only numbers and the
% sign, not letters.
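
A minimal sketch of that restriction, assuming the .NET wrapper used in the
quoted code exposes SetVariable (as the charlesw/tesseract package does) and
that the engine honours the standard tessedit_char_whitelist variable. The
class name, method name, and whitelist string are illustrative choices, not
something taken from the original post:

```csharp
using System.Drawing;
using Tesseract;

public static class NumericOcr
{
    // Hypothetical helper: recognises a bitmap that is expected to contain
    // only a percentage such as "13.00%".
    public static string ReadPercentage(Bitmap bitmap)
    {
        using (var engine = new TesseractEngine(@"./tessdata", "eng", EngineMode.TesseractOnly))
        {
            // Allow only digits, the decimal point and the percent sign,
            // so that "1" can no longer come back as "I" or "%" as "X".
            engine.SetVariable("tessedit_char_whitelist", "0123456789.%");

            using (var page = engine.Process(bitmap))
            {
                return page.GetText().Trim();
            }
        }
    }
}
```

The whitelist is set before Process is called, so it applies to that
recognition pass. Whether it is honoured may depend on the engine mode and
the Tesseract version, so it is worth checking against the sample image.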
On Monday, April 4, 2016 at 2:42:45 PM UTC+8, Vasa Serafin wrote:
>
> Hi community,
>
> I have been playing around with the engine and have run into issues with
> some pictures. I am using computer-generated bitmaps of diagrams that I
> create and that change regularly.
>
> The issue I have is that the text, which is numeric in nature, is not
> being identified, or is identified wrong (not by much, but enough).
>
> Attached is an example image. It shows 13.00%, which is sometimes
> identified as I3.00%, I 3.00X, or I3.0096.
>
> I can understand why this occurs, as these characters look similar to the
> engine. When I increase the image size it works better, which is expected
> and supported by the optimization documentation (optimal size is 300 DPI).
>
> I would like some guidance as to any flags or the like, or even an
> advanced numeric trainingdata that can help in this regard.
>
> Any advice or tips or even a guide to better utilization of the engine
> would be appreciated.
>
> Thanks.
>
> PS. Current code:
>
> engine = new TesseractEngine(@"./tessdata", "eng", EngineMode.TesseractOnly, "config");
>
> private string Decypher_add_entries(Bitmap bitmap, int blowupW, int blowupH)
> {
>     bitmap = ResizeImage(bitmap, bitmap.Width * blowupW,
>                          bitmap.Height * blowupH);
>
>     string text = "";
>
>     //var i = 1;
>     using (var page = engine.Process(bitmap))
>     {
>         text = page.GetText();
>     }
>
>     return text;
> }
>
> I might not be utilizing all the available commands that could assist me.
> That's all the code I use for the implementation, which is a fairly simple
> 3-4 lines of code.
>