Thanks, Sven! Actually, that is what I did, and it is working: somewhere between better and great (of course, my expectations go up as I see what is possible).

I have also written my own convert-to-grayscale and max-contrast routines. The contrast step deliberately overshoots the full range, so values close to the minimum and maximum are pushed to pure black and white as well. It is working a lot better. I still get occasional bad results, though far less frequently.

So, for anyone wanting to do this as well: scaling up the image by a factor of 3 and increasing the contrast improves recognition quality a lot.

I wonder: is there any testdata for Windows with standard fonts? Or how should I approach this?

Best wishes
Andreas

On 6 Jul., 17:57, Sven Pedersen <[email protected]> wrote:
> For screen captures it is necessary to increase the resolution, since
> it is usually 72-90dpi you must rescale them to 200-300dpi, then
> you'll see a drastic improvement in accuracy. I don't know anything
> about the C# stuff though...
> --Sven
>
> On Wed, Jul 6, 2011 at 9:02 AM, Andreas Reiff <[email protected]>
> wrote:
> > Hello Quan!
> >
> > That did the trick, many thanks!
> >
> > By the way, I am using unsafe, because it is in the example
> > Simple1.cs.
> >
> > Apart from that, I would rather not use it, since it propagates up..
> > and it doesn't prevent an application from crashing anyway.
> >
> > If you find the time, could you answer one more related question: I
> > want to do screen text recognition, like text on menus, in notepad,
> > and the like. Your testdata seems to be rather bad for this (now that
> > it is running, I could test). How best to handle this? Create/get new
> > testdata? Is it possible to use it without testdata at all?
> >
> > I would have expected screen recognition to be especially easy, since
> > there is no noise. But then again, I have spent too little time to
> > look into this yet.
> >
> > Best wishes,
> > Andreas
> >
> > On 6 Jul., 14:05, Quan Nguyen <[email protected]> wrote:
> >> Andreas,
> >>
> >> Try adding a slash to the data path, such as:
> >>
> >> string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata\";
> >>
> >> I'm curious as to why you use unsafe block in your code.
> >>
> >> Quan
> >>
> >> On Jul 6, 5:01 am, Andreas Reiff <[email protected]> wrote:
> >> > I get an AccessViolationException, trying to adapt your code for my
> >> > needs: Attempted to read or write protected memory. This is often an
> >> > indication that other memory is corrupt.
> >> >
> >> > The code is more or less copied from your simple1 - my bitmap does not
> >> > come out of a file but from a screenshot (part of the screen).
> >> >
> >> > public static void Recognize(Bitmap bmp)
> >> > {
> >> >     string language = "eng";
> >> >     int oem = (int)eOcrEngineMode.OEM_DEFAULT;
> >> >
> >> >     using (TesseractProcessor processor = new TesseractProcessor())
> >> >     {
> >> >         DateTime started = DateTime.Now;
> >> >         DateTime ended = DateTime.Now;
> >> >
> >> >         string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata";
> >> >
> >> >         processor.Init(tessdataFolder, language, oem);
> >> >
> >> >         string text = "";
> >> >         unsafe
> >> >         {
> >> >             started = DateTime.Now;
> >> >             text = processor.Recognize(bmp);
> >> >             ended = DateTime.Now;
> >> >
> >> >             Console.WriteLine("Duration recognition: {0} ms\n\n",
> >> >                 (ended - started).TotalMilliseconds);
> >> >         }
> >> >
> >> >         Console.WriteLine(
> >> >             string.Format("RecognizeMode: {1}\nRecognized Text:\n{0}\n++++++++++++++++++++++++++++++++\n", text,
> >> >                 ((eOcrEngineMode)oem).ToString()));
> >> >     }
> >> > }
> >> >
> >> > BTW, thx for writing a wrapper - if it works, it solves just about all
> >> > my problems. :)
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
>
> --
> ``All that is gold does not glitter,
> not all those who wander are lost;
> the old that is strong does not wither,
> deep roots are not reached by the frost.
> From the ashes a fire shall be woken,
> a light from the shadows shall spring;
> renewed shall be blade that was broken,
> the crownless again shall be king.''
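
For anyone who wants to try the preprocessing described at the top of this message (grayscale conversion, contrast stretch with some overshoot, 3x upscaling), here is a minimal sketch of how it could look. It is not the code actually used in this thread: the class `ScreenOcrPreprocess`, its method names, the luma weights, the `clipFraction` parameter and the nearest-neighbour scaling are all assumptions made for illustration. It works on plain byte arrays, so copying pixels out of a `Bitmap` and passing the result to the wrapper is left out.

```csharp
using System;

// Hypothetical helper, not part of the Tesseract wrapper discussed above.
public static class ScreenOcrPreprocess
{
    // Convert packed 24-bit pixels (assumed R, G, B byte order) to 8-bit gray
    // using the common Rec. 601 luma weights.
    public static byte[] ToGray(byte[] rgb)
    {
        byte[] gray = new byte[rgb.Length / 3];
        for (int i = 0; i < gray.Length; i++)
        {
            int r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
            gray[i] = (byte)((299 * r + 587 * g + 114 * b + 500) / 1000);
        }
        return gray;
    }

    // Stretch contrast to the full 0..255 range. Roughly clipFraction of the
    // pixels at each end are saturated to black or white (the "overshoot").
    public static byte[] StretchContrast(byte[] gray, double clipFraction)
    {
        int[] hist = new int[256];
        foreach (byte v in gray) hist[v]++;

        int clip = (int)(gray.Length * clipFraction);

        // Darkest and brightest levels that survive the clipping.
        int lo = 0, acc = 0;
        while (lo < 255 && acc + hist[lo] <= clip) { acc += hist[lo]; lo++; }
        int hi = 255;
        acc = 0;
        while (hi > 0 && acc + hist[hi] <= clip) { acc += hist[hi]; hi--; }

        byte[] result = new byte[gray.Length];
        if (hi <= lo)
        {
            // Flat or empty image: nothing to stretch.
            Array.Copy(gray, result, gray.Length);
            return result;
        }
        for (int i = 0; i < gray.Length; i++)
        {
            int v = (gray[i] - lo) * 255 / (hi - lo);
            result[i] = (byte)Math.Max(0, Math.Min(255, v));
        }
        return result;
    }

    // Nearest-neighbour upscaling by an integer factor. Chosen only for
    // simplicity; a smoother interpolation may well give better OCR results.
    public static byte[] Upscale(byte[] gray, int width, int height, int factor)
    {
        int newWidth = width * factor;
        int newHeight = height * factor;
        byte[] result = new byte[newWidth * newHeight];
        for (int y = 0; y < newHeight; y++)
            for (int x = 0; x < newWidth; x++)
                result[y * newWidth + x] = gray[(y / factor) * width + (x / factor)];
        return result;
    }
}
```

A typical call order would be `ToGray`, then `StretchContrast` with a small `clipFraction` such as 0.01, then `Upscale` with a factor of 3. Both numbers are starting points to experiment with, not measured optima.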

