psm = pagesegmode:

0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD or OCR.
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.

It is implemented in Tesseract version 3.01 (which will be released "soon")...
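For anyone calling Tesseract from a script, here is a minimal Python sketch of how the mode table above could be passed on the command line. It assumes a 3.01 `tesseract` executable on the PATH; the names `PSM_MODES`, `build_command` and `run_ocr` are made up for illustration and are not part of Tesseract. The "Can't open psm" / "Can't open 7" messages quoted below look like 3.00 treating the extra arguments as config file names, which would fit the option not existing there yet, but that reading is an assumption on my part.

```python
import subprocess

# Page segmentation modes, copied from the list above (Tesseract 3.01).
PSM_MODES = {
    0: "Orientation and script detection (OSD) only",
    1: "Automatic page segmentation with OSD",
    2: "Automatic page segmentation, but no OSD or OCR",
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Assume a single column of text of variable sizes",
    5: "Assume a single uniform block of vertically aligned text",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    9: "Treat the image as a single word in a circle",
    10: "Treat the image as a single character",
}


def build_command(image, output_base, psm=3, exe="tesseract"):
    """Return the argument list for: tesseract IMAGE OUTPUTBASE -psm N.

    Hypothetical helper: it only assembles arguments and rejects
    values that are not in the table above.
    """
    if psm not in PSM_MODES:
        raise ValueError("unknown psm value: %r" % (psm,))
    return [exe, image, output_base, "-psm", str(psm)]


def run_ocr(image, output_base, psm=3):
    """Run Tesseract; the text is written to output_base + '.txt'."""
    subprocess.check_call(build_command(image, output_base, psm))


if __name__ == "__main__":
    # Same call as the one quoted below, using mode 7 (single text line).
    print(" ".join(build_command("webcam4-2.jpg", "webcam4-2", psm=7)))
```

Only `build_command` is exercised when the file is run directly; `run_ocr` needs a real image and a Tesseract build that understands `-psm`.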

Zdenko

On Fri, Aug 19, 2011 at 2:45 PM, Andriy Malovanyy <[email protected]> wrote:

> Thanks guys for the help! Really appreciate it!
>
> To sriranga:
> I tried changing the DPI (see the previous post). It doesn't work.
>
> To Dmitri:
> I tried manually removing the greyness (see the previous post). It
> doesn't work. I think the major issue is the language file. The
> characters are probably too bold.
>
> To Andres:
> Which file did you try to recognize? The last one, manually edited?
> Can you try to recognize the other ones as well? JPG files work well
> with Tesseract 3.0; some of the files I created with Photoshop have
> been recognized. By the way, can you try to recognize one of them (see
> the first post)? Do you also get one full stop instead of two?
>
> To Zdenko:
> What does the -psm option do? I tried to google it and could not find
> an answer. When I run "tesseract.exe webcam4-2.jpg text -psm 7" I
> get:
> read_variables_files: Can't open psm
> read_variables_files: Can't open 7
>
> I googled that error as well and did not find much help either.
> Can you also try processing the other unedited images which I attached
> to the first post?
>
>
> On 19 Aug, 07:20, zdenko podobny <[email protected]> wrote:
> > I tried it with version 3.01 and:
> > tesseract download\webcam4-2.jpg webcam4-2
> > produced an empty page.
> >
> > BUT:
> > tesseract download\webcam4-2.jpg webcam4-2 -psm 7
> > produced the correct result...
> >
> > Zdenko
> >
> > On Fri, Aug 19, 2011 at 6:54 AM, Andres <[email protected]> wrote:
> > > Hi Andriy,
> >
> > > I'm using Tesseract 2.04
> >
> > > I don't remember whether it works with JPG files. When I tried it
> > > with your image I obtained:
> >
> > > Tesseract Open Source OCR Engine
> > > name_to_image_type:Error:Unrecognized image type:webcam.jpg
> > > IMAGE::read_header:Error:Can't read this image type:webcam.jpg
> > > tesseract:Error:Read of file failed:webcam.jpg
> >
> > > So I opened your file with Windows Paint and saved it as webcam.bmp
> >
> > > Then I executed:
> >
> > > tesseract webcam.bmp output -l eng
> >
> > > and I obtained the file "output.txt" with the correct text.
> >
> > > Regards,
> >
> > > Andres
> >
> > > 2011/8/18 Andriy Malovanyy <[email protected]>:
> > > > Forgot to add attachment
> >
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "tesseract-ocr" group.
> > > > To post to this group, send email to [email protected]
> > > > To unsubscribe from this group, send email to
> > > > [email protected]
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/tesseract-ocr?hl=en
> >
>
