Did you manage to compile this under Visual Studio 2010?


On Thu, Jul 7, 2011 at 12:18 AM, Quan Nguyen <[email protected]> wrote:
> Andreas,
>
> By scaling the screenshots to a higher resolution, to about 300 DPI,
> you'd likely get better results. VietOCR.NET has a Screenshot mode
> that performs this rescaling. You may want to check it out.
>
> I believe the language packs included in tesseractdotnet are
> Tesseract's standard issue. The eng pack seems to work very well for
> Windows' standard fonts. Check the site 
> http://code.google.com/p/tesseract-ocr/
> for more info.
>
> Quan
>
> On Jul 6, 12:13 pm, Andreas Reiff <[email protected]> wrote:
>> Thanks, Sven!
>>
>> Actually, that is what I did, and it is working.. between better and
>> great (of course, my expectations go up with me seeing what is
>> possible).
>>
>> I have also written my own convert-to-grayscale and max-contrast,
>> allowing even for some increase above max values (putting some values
>> close to max and min to black and white as well).
>>
>> It is working a lot better.
>>
>> I still get occasional bad results (though way less frequently).
>>
>> So, for anyone wanting to do this as well: scaling up the image by a
>> factor of 3 and increasing the contrast improves recognition quality a
>> lot.
>>
>> I wonder: is there any testdata for Windows with standard fonts? Or
>> how to approach this?
>>
>> Best wishes
>> Andreas
>>
>> On 6 Jul., 17:57, Sven Pedersen <[email protected]> wrote:
>>
>>
>>
>> > For screen captures it is necessary to increase the resolution. Since
>> > it is usually 72-90 dpi, you must rescale them to 200-300 dpi; then
>> > you'll see a drastic improvement in accuracy. I don't know anything
>> > about the C# stuff though...
>> > --Sven
>>
>> > On Wed, Jul 6, 2011 at 9:02 AM, Andreas Reiff <[email protected]> wrote:
>> > > Hello Quan!
>>
>> > > That did the trick, many thanks!
>>
>> > > By the way, I am using unsafe, because it is in the example
>> > > Simple1.cs.
>>
>> > > Apart from that, I would rather not use it, since it propagates up..
>> > > and it doesn't prevent an application from crashing anyway.
>>
>> > > If you find the time, could you answer one more related question: I
>> > > want to do screen text recognition, like text on menus, in notepad,
>> > > and the like. Your testdata seems to be rather bad for this (now that
>> > > it is running, I could test). How best to handle this? Create/get new
>> > > testdata? Is it possible to use it without testdata at all?
>>
>> > > I would have expected screen recognition to be especially easy, since
>> > > there is no noise. But then again, I have spent too little time to
>> > > look into this yet.
>>
>> > > Best wishes,
>> > > Andreas
>>
>> > > On 6 Jul., 14:05, Quan Nguyen <[email protected]> wrote:
>> > >> Andreas,
>>
>> > >> Try adding a slash to the data path, such as:
>>
>> > >> string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata\";
>>
>> > >> I'm curious as to why you use an unsafe block in your code.
>>
>> > >> Quan
>>
>> > >> On Jul 6, 5:01 am, Andreas Reiff <[email protected]> wrote:
>>
>> > >> > I get an AccessViolationException, trying to adapt your code for my
>> > >> > needs: Attempted to read or write protected memory. This is often an
>> > >> > indication that other memory is corrupt.
>>
>> > >> > The code is more or less copied from your simple1 - my bitmap does not
>> > >> > come out of a file but from a screenshot (part of the screen).
>>
>> > >> > public static void Recognize(Bitmap bmp)
>> > >> > {
>> > >> >     string language = "eng";
>> > >> >     int oem = (int)eOcrEngineMode.OEM_DEFAULT;
>>
>> > >> >     using (TesseractProcessor processor = new TesseractProcessor())
>> > >> >     {
>> > >> >         DateTime started = DateTime.Now;
>> > >> >         DateTime ended = DateTime.Now;
>>
>> > >> >         string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata";
>>
>> > >> >         processor.Init(tessdataFolder, language, oem);
>>
>> > >> >         string text = "";
>> > >> >         unsafe
>> > >> >         {
>> > >> >             started = DateTime.Now;
>> > >> >             text = processor.Recognize(bmp);
>> > >> >             ended = DateTime.Now;
>>
>> > >> >             Console.WriteLine("Duration recognition: {0} ms\n\n",
>> > >> >                 (ended - started).TotalMilliseconds);
>> > >> >         }
>>
>> > >> >         Console.WriteLine(
>> > >> >             string.Format("RecognizeMode: {1}\nRecognized Text:\n{0}\n++++++++++++++++++++++++++++++++\n",
>> > >> >                 text, ((eOcrEngineMode)oem).ToString()));
>>
>> > >> >     }
>>
>> > >> > }
>>
>> > >> > BTW, thx for writing a wrapper - if it works, it solves just about all
>> > >> > my problems. :)
>>
>> > > --
>> > > You received this message because you are subscribed to the Google
>> > > Groups "tesseract-ocr" group.
>> > > To post to this group, send email to [email protected]
>> > > To unsubscribe from this group, send email to
>> > > [email protected]
>> > > For more options, visit this group at
>> > > http://groups.google.com/group/tesseract-ocr?hl=en
>>
>> > --
>> > ``All that is gold does not glitter,
>> >   not all those who wander are lost;
>> > the old that is strong does not wither,
>> >   deep roots are not reached by the frost.
>> > From the ashes a fire shall be woken,
>> >   a light from the shadows shall spring;
>> > renewed shall be blade that was broken,
>> >   the crownless again shall be king.”
>
