Extract image content

Micael Leal Tue, 19 Mar 2013 19:01:29 -0700

Hello,

I am trying to integrate tesseract-ocr into my PowerPoint program in order to 
recognize the content of pictures.

I can already extract a picture from PowerPoint, but I also want to extract 
its content.

Each picture has [myvariable] drawn inside it, and I want to extract 
[myvariable] so that I can use it later.

    Bitmap image = new Bitmap(@"C:\Users\vh610\AppData\Local\Temp\image.bmp");
    tessnet2.Tesseract ocr = new tessnet2.Tesseract();
    // If digit only
    ocr.SetVariable("tessedit_char_whitelist", "0123456789");
    // To use correct tessdata
    ocr.Init(@"E:\app\PPT\tessdata", "eng", false);
    List<tessnet2.Word> result = ocr.DoOCR(image, Rectangle.Empty);
    foreach (tessnet2.Word word in result)
        Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
                                                
The .dll is referenced correctly in the project and the program runs, but 
the call to "ocr.Init" fails with an error.
The error is: Unable to load unicharset file 
E:/app/PPT/tessdata/\eng.unicharset

My main project is located at E:\app\PPT\Source\ppt.sln, and my tessdata 
folder is E:\app\PPT\tessdata, which contains grc.traineddata.
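
To compare the language code passed to ocr.Init with what the folder holds, 
a check along these lines could be run first. This is only a minimal 
sketch: the class name TessdataCheck is hypothetical, the path and language 
code are copied from the snippet above, and it uses only System.IO calls, 
not tessnet2 itself.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of tessnet2: lists the tessdata folder and
// reports whether any file name starts with the requested language code.
class TessdataCheck
{
    static void Main()
    {
        string tessdata = @"E:\app\PPT\tessdata"; // folder passed to ocr.Init
        string lang = "eng";                      // language passed to ocr.Init

        if (!Directory.Exists(tessdata))
        {
            Console.WriteLine("Folder not found: " + tessdata);
            return;
        }

        // Show every file that is present in the folder.
        foreach (string file in Directory.GetFiles(tessdata))
            Console.WriteLine("found: " + Path.GetFileName(file));

        // Look for files named "<lang>.<something>", e.g. eng.unicharset.
        string[] matches = Directory.GetFiles(tessdata, lang + ".*");
        Console.WriteLine("{0} file(s) for language \"{1}\"", matches.Length, lang);
    }
}
```

The error message names eng.unicharset, so the point of the check is simply 
to see whether any "eng.*" file exists next to grc.traineddata.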

What am I doing wrong? Thanks

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

