fl is recognized as a ligature in English, so the same issue could occur in Italian. If it is replaced with '?', I would guess you have a problem with Unicode... Can you check it with the tesseract executable?
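
A minimal sketch of how the Unicode guess could be checked from Java, assuming the OCR result really contains the single ligature character U+FB02 rather than the two letters "f" and "l". The class and method names (LigatureCheck, containsLigature, throughAscii, expandLigatures) are made up for illustration and are not part of Tess4J; the literal string stands in for whatever doOCR returns.

```java
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class LigatureCheck {
    // True if the text contains a Latin ligature from the block U+FB00..U+FB06 (ff, fi, fl, ...).
    static boolean containsLigature(String text) {
        return text.codePoints().anyMatch(cp -> cp >= 0xFB00 && cp <= 0xFB06);
    }

    // Simulates printing or saving through an encoding that cannot represent the ligature:
    // unmappable characters are replaced with '?'.
    static String throughAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // NFKC compatibility normalization expands ligatures into plain letters.
    static String expandLigatures(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the string returned by doOCR (assumption: it holds U+FB02).
        String recognized = "\uFB02uidi";

        System.out.println("contains ligature: " + containsLigature(recognized));
        System.out.println("through ASCII:     " + throughAscii(recognized));

        // Print through an explicit UTF-8 stream so the console encoding is not the default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("expanded:          " + expandLigatures(recognized));
    }
}
```

If containsLigature reports true for the real OCR output, the '?' is most likely introduced when the text is printed or saved, not by the recognition itself. Note that NFKC also rewrites other compatibility characters (superscripts, full-width forms), so it may be more than you want for some documents.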
Zdenko

On Thu, Oct 16, 2014 at 10:18 AM, Salvo Piazza <[email protected]> wrote:
> Hi all,
> I've written a little simple program to extract text from image with
> tesseract 3.0.2 as:
>
> Tesseract instance = Tesseract.getInstance();
> instance.setDatapath(currentDir);
> instance.setLanguage("ita");
> String returner = instance.doOCR(new File(filename));
>
> It works fine but I've many question mark chars '?' in the extracted text.
>
> For example the word *fluidi* is recognized as *?uidi* and much more
> example...
>
> Does anyone know some tips in order to fix this behaviour?
>
> Thanks in advance,
> Salvo.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/dc3dc154-fc24-48d8-8f5e-4a1df7f36282%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

