from:"Salvo Piazza"

Re: [tesseract-ocr] Many 'question mark' chars in recognized text

2014-10-17 Thread Salvo Piazza

there could be the same issue in Italian. If it is replaced with '?', I would guess you have a problem with Unicode... Can you check it with the tesseract executable?

Zdenko

On Thu, Oct 16, 2014 at 10:18 AM, Salvo Piazza s.pi...@tsc-consulting.com wrote: Hi all, I've written a little
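As a rough illustration of the Unicode guess in the reply above, here is a minimal Java sketch of one way accented Italian characters can turn into '?': the text is encoded with a charset that cannot represent them. It is not taken from the thread and does not use Tesseract at all; the class name EncodingDemo, the method roundTrip and the sample word are assumptions made up for this example, and the actual cause of the poster's problem may well be different.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class (not from the thread): shows how characters that a
// charset cannot represent are replaced with '?' when a String is encoded.
class EncodingDemo {

    // Encode the text to bytes with the given charset, then decode it again.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // (for US-ASCII this is '?') for every character it cannot map.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "perch\u00e9" is the Italian word "perché", written with a Unicode
        // escape so the source file encoding does not matter.
        String word = "perch\u00e9";

        // The accented letter survives a UTF-8 round trip.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8).equals(word));

        // With US-ASCII the accented letter is lost and becomes '?'.
        System.out.println(roundTrip(word, StandardCharsets.US_ASCII));
    }
}
```

If the recognized text looks right when it is written out as UTF-8 but shows '?' in a console or in a file written with the platform default charset, the output encoding is a more likely suspect than the OCR step itself.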

[tesseract-ocr] Many 'question mark' chars in recognized text

2014-10-16 Thread Salvo Piazza

Hi all, I've written a simple little program to extract text from an image with tesseract 3.0.2:

    Tesseract instance = Tesseract.getInstance();
    instance.setDatapath(currentDir);
    instance.setLanguage("ita");
    String returner = instance.doOCR(new File(filename));

It works fine but I've many question