Hi Nick,
I've taken a look at api/tesseractmain.cpp as you recommended, but I cannot
find anything wrong in my code. I'm posting my program here in the hope that
you can help me work out what is going on.
This is my method:
___________________________________________________________________
#include <algorithm>
#include <clocale>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <opencv2/core/core.hpp>
#include <tesseract/baseapi.h>

void recognizeChar(cv::Mat imagen){
/*INITIALIZE (TESSERACT)*/
setenv("TESSDATA_PREFIX", "/usr/local/share/", 1);
setlocale(LC_NUMERIC, "C");
tesseract::TessBaseAPI OCR;
if (OCR.Init(NULL, "spa")){
fprintf( stderr, "could not initialize tesseract.... \n" );
exit(1);
}
/*CONFIGURING*/
OCR.SetPageSegMode(tesseract::PSM_SINGLE_LINE);
OCR.SetVariable("tessedit_char_whitelist",
"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ");// whitelist
OCR.SetVariable("tessedit_char_blacklist" ,
"<>abcdefghijklmnopqrstuvwxyz./!¡$%&?¿,;+-#");// blacklist
// imagen.step is the row stride in bytes; this assumes 8-bit channels
OCR.SetImage(imagen.data, imagen.cols, imagen.rows,
imagen.channels(), static_cast<int>(imagen.step));
// SetImage followed by GetUTF8Text is enough; no separate
// TesseractRect call is needed.
/*GETTING THE RECOGNIZED TEXT*/
char* texto = OCR.GetUTF8Text();
std::string t1 = texto ? texto : "";
delete [] texto; // the caller owns the buffer returned by GetUTF8Text
t1.erase( std::remove(t1.begin(), t1.end(), '\n'), t1.end() );
std::cout << "TEXTO: " << t1 << std::endl;
}
_______________________________________________________________________
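Separately from the whitelist variable, the recognized string could also be
filtered after the fact, along the lines of the postprocessing suggested
earlier in this thread. This is only a minimal sketch: keepAllowed is a
hypothetical helper of my own (it is not part of the Tesseract API), and the
allowed set is assumed to be the same one passed as the whitelist.

```cpp
#include <string>

// Hypothetical helper, not part of Tesseract: returns a copy of `text`
// containing only the characters that also appear in `allowed`.
std::string keepAllowed(const std::string& text, const std::string& allowed) {
    std::string out;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        // std::string::find returns npos when the character is not allowed
        if (allowed.find(text[i]) != std::string::npos) {
            out += text[i];
        }
    }
    return out;
}
```

It would be called on t1 just before printing, for example
keepAllowed(t1, "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 "). It only drops
characters; it cannot repair a misread letter.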
Thank you all.
On Tuesday, 3 December 2013 11:29:58 UTC+1, Nick White wrote:
>
> Hi Adrian,
>
> Well then your C++ program must be wrong in some way. The command
> line version doesn't do anything special, it just uses the API like
> anything else. Take a look at api/tesseractmain.cpp to check how
> your API usage differs, to find your bug.
>
> Nick
>
> On Tue, Dec 03, 2013 at 01:16:40AM -0800, adrian company wrote:
> > Hi Sventech,
> > I've tested the image with the command line version and I get the same
> > result as you. But when I use my own software in C++ I cannot obtain the
> > same result; I simply get nothing. Currently I am using PSM_SINGLE_LINE,
> > but as I said before, I've tried all the page seg modes.
> > I don't know what is wrong. I've reinstalled tesseract and the result is
> > the same.
> >
> >
> > On Tuesday, 3 December 2013 07:29:11 UTC+1, adrian company wrote:
> >
> > And about the page seg mode: I've tried all of them, but I still get
> > nothing.
> >
> > On Monday, 2 December 2013 16:13:17 UTC+1, sventech wrote:
> >
> > I get
> > V! 2\"03ENl
> > so you could postprocess that kind of thing to get better results --
> > you need to eliminate the black border for best results. You may need
> > to remove noise. What page seg mode are you using? Make sure you test
> > with the command line version before you try your own. Also, I'm using
> > the latest version 3.02.02
> > --Sven
> >
> >
> >
> > On Mon, Dec 2, 2013 at 6:18 AM, adrian company <[email protected]>
> > wrote:
> >
> > Hi again, I've tried to deskew the first image and pass an enlarged
> > version of it to tesseract, but I get the same result: the numbers and
> > letters are not recognized by tesseract. I'm posting an image where you
> > can see how my image looks now.
> > Any ideas?
> > Thanks in advance again.
> >
> >
> >
> >
> >
> > On Thursday, 31 October 2013 07:22:53 UTC+1, adrian company wrote:
> >
> > Thanks Sventech, I'll try to deskew the first one. I'm using opencv
> > to prepare the image, so I cannot use any other program to prepare it.
> > I've tried rotating the image so that the text is horizontal before
> > passing it to tesseract, but tesseract outputs the same. I will also
> > try passing it in PNG format and see the result.
> >
> >
> > On Wednesday, October 30, 2013 3:21:58 PM UTC+1, sventech wrote:
> >
> > In the first image you need to deskew it. There are free
> > programs for preparing the image. The second image appears
> > to be too low resolution (or letter pixel height, to be
> > precise). Approx. 200-300dpi is ideal for tesseract's
> > default training. Also, JPEG is not a good format for text.
> > Internally it will convert to TIFF or PNG.
> >
> >
> > On Wed, Oct 30, 2013 at 6:50 AM, adrian company <[email protected]> wrote:
> >
> > Hi all, I am trying to write software to recognize
> > some text from an image, but when I binarize the image
> > and call the tesseract engine, it does not recognize the
> > text in the image. Does anybody know why the text is not
> > recognized? Must I do something extra?
> > I attach the image in which I am trying to recognize text
> > (license plate). For this attached image the tesseract
> > output is nothing.
> >
> > I've also tried to recognize text from another image
> > (Fuma) and in this case the output is: "L I".
> >
> > Could anybody help me?
> >
> > What could be happening?
> >
> >
> > Thanks in advance.
> > Adri
> >
> >
> >
> >
> > --
> > --
> > You received this message because you are subscribed to
> > the Google Groups "tesseract-ocr" group.
> > To post to this group, send email to
> > [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
> >
> > ---
> > You received this message because you are subscribed to
> > the Google Groups "tesseract-ocr" group.
> > To unsubscribe from this group and stop receiving
> > emails from it, send an email to
> > [email protected].
> > For more options, visit https://groups.google.com/groups/opt_out.
> >
> >
> >
> >
> > --
> > ``All that is gold does not glitter,
> > not all those who wander are lost;
> > the old that is strong does not wither,
> > deep roots are not reached by the frost.
> > From the ashes a fire shall be woken,
> > a light from the shadows shall spring;
> > renewed shall be blade that was broken,
> > the crownless again shall be king.”
> >
> >
> >
> >
> >
> >
>