Once again, with more information: I have a problem using Tesseract with German Fraktur.
I work with Tesseract 3.02.02 on SUSE Linux 13.2. The text to be OCR'd is real printed text from about 1930. The printing is a little dirty, i.e. there are small dots and strokes between the letters. Although these are far smaller than the actual letters, they are interpreted as normal letters.

(attachment: oes-frak.frak.exp017)

Is there a way to pass parameters to Tesseract so that it
- either ignores letters that do not fit the majority of the other letters,
- or only accepts letters within a given size range,
- or first makes the boxes, lets me correct them by hand or by program, and finally recognizes the text using the corrected boxes?

I have already tried a config file that modifies

    textord_min_xheight 24
    textord_xheight_mode_fraction 0.9
    textord_xheight_error_margin 0.1
    textord_descx_ratio_min 0.3
    tessedit_redo_xheight FALSE

It changes some things, but it does nothing to make Tesseract ignore the dots and strokes.

An example: the attached picture is recognized as the text

    15 Ellser Exdmsund Mögsgzerg

A solution with a dictionary is not possible, because the text consists only of names of persons and places.

Another thing I wonder about: when I OCR an image from .tiff to .txt and also run makebox on the same image, a few letters are recognized differently!

Thanks in advance for any help.
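
P.S. To make the "correct the boxes by program" idea concrete, here is a rough sketch of what I have in mind. It is not a Tesseract feature, only a small script of my own. It assumes the usual box file layout of one glyph per line in the form "<glyph> <left> <bottom> <right> <top> <page>". The function names and the 0.4 threshold are my own illustrative choices, not anything Tesseract defines.

```python
from statistics import median


def parse_box_line(line):
    """Split one box-file line into (glyph, left, bottom, right, top).

    Assumes the layout '<glyph> <left> <bottom> <right> <top> <page>';
    the trailing page field is ignored here.
    """
    parts = line.split()
    if len(parts) < 5:
        raise ValueError("not a box line: %r" % line)
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return parts[0], left, bottom, right, top


def drop_small_boxes(lines, min_fraction=0.4):
    """Keep only boxes whose height is at least min_fraction of the median height.

    min_fraction is a hypothetical tuning knob; it has to be chosen per scan.
    """
    kept_candidates = [ln for ln in lines if ln.strip()]
    if not kept_candidates:
        return []
    # Box height is top minus bottom in box-file coordinates.
    heights = []
    for ln in kept_candidates:
        _, _, bottom, _, top = parse_box_line(ln)
        heights.append(top - bottom)
    threshold = min_fraction * median(heights)
    return [ln for ln, h in zip(kept_candidates, heights) if h >= threshold]


if __name__ == "__main__":
    sample = [
        "E 10 10 40 60 0",
        "l 45 10 55 60 0",
        ". 58 30 61 34 0",   # a speck between two letters
        "s 65 10 85 40 0",
    ]
    for box in drop_small_boxes(sample):
        print(box)
```

The obvious weakness is that real punctuation (periods, commas) is just as small as the dirt, so a filter on height alone would throw it away as well. It would probably need an additional check on the vertical position relative to the baseline.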

