Did you manage to compile this under Visual Studio 2010?
On Thu, Jul 7, 2011 at 12:18 AM, Quan Nguyen <[email protected]> wrote:
> Andreas,
>
> By scaling the screenshots to a higher resolution, to about 300 DPI,
> you'd likely get better results. VietOCR.NET has a Screenshot mode
> that performs this rescaling. You may want to check it out.
>
> I believe the language packs included in tesseractdotnet are
> Tesseract's standard issues. The eng seems to work very well for
> Windows' standard fonts. Check the site
> http://code.google.com/p/tesseract-ocr/
> for more info.
>
> Quan
>
> On Jul 6, 12:13 pm, Andreas Reiff <[email protected]> wrote:
>> Thanks, Sven!
>>
>> Actually, that is what I did, and it is working.. between better and
>> great (of course, my expectations go up with me seeing what is
>> possible).
>>
>> I have also written my own convert-to-grayscale and max-contrast,
>> allowing even for some increase above max values (putting some values
>> close to max and min to black and white as well).
>>
>> It is working a lot better.
>>
>> I still get occasional bad results (though way less frequently).
>>
>> So, for anyone wanting to do this as well: scaling up the image by a
>> factor of 3 and increasing the contrast improves recognition quality a
>> lot.
>>
>> I wonder: is there any testdata for Windows with standard fonts? Or
>> how to approach this?
>>
>> Best wishes
>> Andreas
>>
>> On 6 Jul., 17:57, Sven Pedersen <[email protected]> wrote:
>>
>> > For screen captures it is necessary to increase the resolution, since
>> > it is usually 72-90dpi you must rescale them to 200-300dpi, then
>> > you'll see a drastic improvement in accuracy. I don't know anything
>> > about the C# stuff though...
>> > --Sven
>>
>> > On Wed, Jul 6, 2011 at 9:02 AM, Andreas Reiff <[email protected]>
>> > wrote:
>> > > Hello Quan!
>>
>> > > That did the trick, many thanks!
>>
>> > > By the way, I am using unsafe, because it is in the example
>> > > Simple1.cs.
>>
>> > > Apart from that, I would rather not use it, since it propagates up..
>> > > and it doesn't prevent an application from crashing anyway.
>>
>> > > If you find the time, could you answer one more related question: I
>> > > want to do screen text recognition, like text on menus, in notepad,
>> > > and the like. Your testdata seems to be rather bad for this (now that
>> > > it is running, I could test). How best to handle this? Create/get new
>> > > testdata? Is it possible to use it without testdata at all?
>>
>> > > I would have expected screen recognition to be especially easy, since
>> > > there is no noise. But then again, I have spent too little time to
>> > > look into this yet.
>>
>> > > Best wishes,
>> > > Andreas
>>
>> > > On 6 Jul., 14:05, Quan Nguyen <[email protected]> wrote:
>> > >> Andreas,
>>
>> > >> Try adding a slash to the data path, such as:
>>
>> > >> string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata\";
>>
>> > >> I'm curious as to why you use unsafe block in your code.
>>
>> > >> Quan
>>
>> > >> On Jul 6, 5:01 am, Andreas Reiff <[email protected]> wrote:
>>
>> > >> > I get an AccessViolationException, trying to adapt your code for my
>> > >> > needs: Attempted to read or write protected memory. This is often an
>> > >> > indication that other memory is corrupt.
>>
>> > >> > The code is more or less copied from your simple1 - my bitmap does not
>> > >> > come out of a file but from a screenshot (part of the screen).
>>
>> > >> > public static void Recognize(Bitmap bmp)
>> > >> > {
>> > >> >     string language = "eng";
>> > >> >     int oem = (int)eOcrEngineMode.OEM_DEFAULT;
>>
>> > >> >     using (TesseractProcessor processor = new TesseractProcessor())
>> > >> >     {
>> > >> >         DateTime started = DateTime.Now;
>> > >> >         DateTime ended = DateTime.Now;
>>
>> > >> >         string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata";
>>
>> > >> >         processor.Init(tessdataFolder, language, oem);
>>
>> > >> >         string text = "";
>> > >> >         unsafe
>> > >> >         {
>> > >> >             started = DateTime.Now;
>> > >> >             text = processor.Recognize(bmp);
>> > >> >             ended = DateTime.Now;
>>
>> > >> >             Console.WriteLine("Duration recognition: {0} ms\n\n",
>> > >> >                 (ended - started).TotalMilliseconds);
>> > >> >         }
>>
>> > >> >         Console.WriteLine(
>> > >> >             string.Format("RecognizeMode: {1}\nRecognized Text:\n{0}\n++++++++++++++++++++++++++++++++\n",
>> > >> >                 text, ((eOcrEngineMode)oem).ToString()));
>>
>> > >> >     }
>>
>> > >> > }
>>
>> > >> > BTW, thx for writing a wrapper - if it works, it solves just about all
>> > >> > my problems. :)
>>
>> > > --
>> > > You received this message because you are subscribed to the Google
>> > > Groups "tesseract-ocr" group.
>> > > To post to this group, send email to [email protected]
>> > > To unsubscribe from this group, send email to
>> > > [email protected]
>> > > For more options, visit this group at
>> > > http://groups.google.com/group/tesseract-ocr?hl=en
>>
>> > --
>> > ``All that is gold does not glitter,
>> > not all those who wander are lost;
>> > the old that is strong does not wither,
>> > deep roots are not reached by the frost.
>> > From the ashes a fire shall be woken,
>> > a light from the shadows shall spring;
>> > renewed shall be blade that was broken,
>> > the crownless again shall be king.”
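
For anyone wanting to try the preprocessing described in the quoted thread (scale the screenshot up by a factor of 3, convert to grayscale, push values near the extremes to black and white), here is a minimal sketch. It is not Andreas's actual code and not part of tesseractdotnet: the class ScreenshotPreprocessor, the method names, the fixed clip margin of 16 and the luminance weights are all assumptions made for illustration. It uses only System.Drawing.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

// Hypothetical helper; the names and default values are illustrative assumptions.
public static class ScreenshotPreprocessor
{
    // Returns an upscaled, grayscale, contrast-stretched copy of the source bitmap.
    // clip must be between 0 and 127.
    public static Bitmap Prepare(Bitmap source, int scale = 3, int clip = 16)
    {
        // Upscale: a capture at roughly 96 DPI scaled 3x comes close to 300 DPI.
        Bitmap result = new Bitmap(source.Width * scale, source.Height * scale);
        using (Graphics g = Graphics.FromImage(result))
        {
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.DrawImage(source, 0, 0, result.Width, result.Height);
        }

        // GetPixel/SetPixel is slow but keeps the sketch free of unsafe code.
        for (int y = 0; y < result.Height; y++)
        {
            for (int x = 0; x < result.Width; x++)
            {
                Color c = result.GetPixel(x, y);
                // Weighted luminance (common ITU-R BT.601 weights).
                int gray = (int)(0.299 * c.R + 0.587 * c.G + 0.114 * c.B);
                int v = Stretch(gray, clip);
                result.SetPixel(x, y, Color.FromArgb(v, v, v));
            }
        }
        return result;
    }

    // Maps [clip, 255 - clip] linearly onto [0, 255]; values outside that
    // range are clamped to pure black or pure white.
    public static int Stretch(int gray, int clip)
    {
        int v = (gray - clip) * 255 / (255 - 2 * clip);
        return Math.Max(0, Math.Min(255, v));
    }
}
```

The result of ScreenshotPreprocessor.Prepare(bmp) could then be passed to processor.Recognize in place of the raw screenshot. The right scale factor and clip margin depend on the capture, so treat both as starting points to tune.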

