Thanks a lot for your help, Ray; you have been a great help to me.

Best Regards,
Lincolin

On Nov 24, 7:09 pm, "Ray Smith" <[EMAIL PROTECTED]> wrote:
> Sorry, there seems to be a problem with the fixed pitch detector. I haven't
> had a chance to investigate it completely yet, and I don't remember exactly,
> but it is possible that the fixed pitch detector doesn't work on modern
> fixed pitch. On old typewriters, the spaces are the same size as the
> characters, but on modern (word processed/laser or inkjet printed) fixed
> pitch, the size of the spaces is variable, and this is probably preventing
> the fixed pitch detection algorithm from seeing the text as fixed pitch.
>
> Font size can be calculated as follows:
>
> ROW* row = ...;
> double pt_size = (row->x_height() + row->ascenders() - row->descenders()) *
>     72.0 / resolution;
>
> Where row can be obtained from a ROW_RES, and resolution is the input
> resolution of the image.
> See write_shm_text in output.cpp.
>
> Ray.
>
> On Sun, Nov 23, 2008 at 2:04 AM, Lincolin <[EMAIL PROTECTED]> wrote:
> > Dear Ray,
> >
> > I am waiting for an answer from you to my message below, please; I need it
> > urgently:
> >
> > ========================================================
> > I have tried WERD::flag(W_DONT_CHOP) on different documents that
> > contain fixed pitch fonts like Courier New or Lucida Console, but this
> > flag is ALWAYS (0). Is there anything else that I need to do to get it
> > to work?
> > Also, is there a way to know the font size of the characters/words?
> > ========================================================
> >
> > Thanks a lot in advance,
> > Lincolin.
> >
> > On Nov 6, 3:36 am, "Ray Smith" <[EMAIL PROTECTED]> wrote:
> > > Since Tesseract never got used in an HP product, none of these things
> > > were ever necessary enough to add.
> > > Serif/sans no, but maybe in 3.0.
> > > Sub/super no. You would have to evaluate that yourself from the
> > > bounding box.
> > > Underlined no, but it does detect underlines, so there is a chance the
> > > information can be recovered.
> > > Strikethrough no, and not much hope either.
> > > Ray.
> > >
> > > On Wed, Nov 5, 2008 at 5:13 AM, Lincolin <[EMAIL PROTECTED]> wrote:
> > > > Thanks a lot Ray, but is there any way to get the following
> > > > properties too?
> > > > - Serif and Sans-Serif.
> > > > - Subscript and Superscript.
> > > > - Underlined.
> > > > - Strikethrough
> > > >
> > > > Thanks a lot for your reply, Ray.
> > > > Lincolin
> > > >
> > > > On Nov 5, 8:35 am, "Ray Smith" <[EMAIL PROTECTED]> wrote:
> > > > > The bold and italic indicators are currently incorrect. In 3.0 this
> > > > > state should be fixed. The fixed pitch v proportional indicator is
> > > > > reliable though. WERD::flag(W_DONT_CHOP) indicates fixed pitch.
> > > > > Ray.
> > > > >
> > > > > On Mon, Nov 3, 2008 at 9:30 AM, Lincolin <[EMAIL PROTECTED]> wrote:
> > > > > > I have been trying to get the font information from tesseract,
> > > > > > like proportional, serif, sans-serif, bold, italic, etc. I have
> > > > > > tried to trace the values tesseract returns for some of these
> > > > > > properties, such as italic, bold and proportional, inside the
> > > > > > WERD_RES structure, but I didn't understand what these values
> > > > > > mean, since they are not booleans and they have different values
> > > > > > (sometimes negative ones).
> > > > > > Does anyone know a way to get this information in order to
> > > > > > create the proper font?
> > > > > >
> > > > > > Thanks a lot in advance,
> > > > > > Lincolin

You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
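For anyone who wants to check the arithmetic in Ray's font-size formula, here is a minimal sketch of it outside Tesseract. RowMetrics and PointSize are hypothetical names made up for this illustration, not part of the Tesseract API; the three fields only stand in for the values returned by row->x_height(), row->ascenders() and row->descenders(). The sketch assumes the descender value is a negative offset below the baseline, which would explain why the formula subtracts it, and that resolution is in pixels per inch.

```cpp
#include <cassert>

// Hypothetical stand-in for the three ROW accessors named in the thread
// (x_height(), ascenders(), descenders()). All values are in image pixels.
struct RowMetrics {
  double x_height;    // baseline to the top of a lowercase 'x'
  double ascenders;   // extra height above the x-height
  double descenders;  // assumed negative: offset below the baseline
};

// Full glyph height in pixels, converted to points:
// 72 points per inch divided by pixels per inch (the scan resolution).
double PointSize(const RowMetrics& row, double resolution_dpi) {
  assert(resolution_dpi > 0.0);  // a zero or negative resolution is meaningless
  const double height_px = row.x_height + row.ascenders - row.descenders;
  return height_px * 72.0 / resolution_dpi;
}
```

With an x-height of 25 px, ascenders of 12.5 px and descenders of -12.5 px on a 300 dpi scan, the full height is 50 px, so PointSize gives 50 * 72 / 300 = 12 pt.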

