import ctypes
import os
os.putenv("PATH", r'C:\Program Files\Tesseract-OCR')
os.environ["TESSDATA_PREFIX"] = r'C:\Program Files\Tesseract-OCR\tessdata'

liblept = ctypes.cdll.LoadLibrary('liblept-5.dll')
pix = liblept.pixRead('test.png'.encode())
print(pix)

tesseractLib = ctypes.cdll.LoadLibrary('libtesseract-5.dll')

tesseractHandle = tesseractLib.TessBaseAPICreate()

tesseractLib.TessBaseAPIInit3(tesseractHandle, '.', 'eng')

tesseractLib.TessBaseAPISetImage2(tesseractHandle, pix)
# text_out = tesseractLib.TessBaseAPIGetUTF8Text(tesseractHandle)
# print(ctypes.string_at(text_out))

tessPageIterator = tesseractLib.TessResultIteratorGetPageIterator(tesseractHandle)
iteratorLevel = 3  # RIL_BLOCK, RIL_PARA, RIL_TEXTLINE, RIL_WORD, RIL_SYMBOL
tesseractLib.TessPageIteratorBoundingBox(tessPageIterator, iteratorLevel,
    ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0))

I got an exception:

Traceback (most recent call last):
  File "D:\BaiduYunDownload\programming\Python\CtypesOCR.py", line 25, in <module>
    tesseractLib.TessPageIteratorBoundingBox(tessPageIterator, iteratorLevel, ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0))
OSError: exception: access violation reading 0x00000018

So what's wrong? The aim of this program is to get the bounding rectangle of
each word. I know about projects like tesserocr
<https://github.com/sirfz/tesserocr> and PyOCR
<https://gitlab.gnome.org/World/OpenPaperwork/pyocr>.

P.S. Specifying the required argument types (function prototypes) for the
DLL functions doesn't matter here. One could uncomment the commented lines
and comment out the last three lines to test it.
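
For reference, here is a minimal, untested sketch of how the bounding-box calls might be arranged instead. It rests on several assumptions: that TessPageIteratorBoundingBox expects four int* out-parameters (so ctypes.byref() rather than c_int(0) values), that TessResultIteratorGetPageIterator expects a result iterator obtained from TessBaseAPIGetIterator after TessBaseAPIRecognize rather than the API handle itself, and that pointer return types may need declaring on a 64-bit build. The helper names declare_prototypes, bounding_box and word_boxes are made up for illustration; the Tess* names are the C API functions as I understand them from capi.h.

```python
import ctypes

RIL_WORD = 3  # index of RIL_WORD in RIL_BLOCK, RIL_PARA, RIL_TEXTLINE, RIL_WORD, RIL_SYMBOL


def declare_prototypes(lib):
    """Declare pointer-sized return and argument types (hypothetical helper)."""
    int_p = ctypes.POINTER(ctypes.c_int)
    lib.TessBaseAPICreate.restype = ctypes.c_void_p
    lib.TessBaseAPIRecognize.argtypes = [ctypes.c_void_p, ctypes.c_void_p]
    lib.TessBaseAPIGetIterator.restype = ctypes.c_void_p
    lib.TessBaseAPIGetIterator.argtypes = [ctypes.c_void_p]
    lib.TessResultIteratorGetPageIterator.restype = ctypes.c_void_p
    lib.TessResultIteratorGetPageIterator.argtypes = [ctypes.c_void_p]
    lib.TessResultIteratorDelete.restype = None
    lib.TessResultIteratorDelete.argtypes = [ctypes.c_void_p]
    lib.TessPageIteratorBoundingBox.restype = ctypes.c_int
    lib.TessPageIteratorBoundingBox.argtypes = [
        ctypes.c_void_p, ctypes.c_int, int_p, int_p, int_p, int_p]
    lib.TessPageIteratorNext.restype = ctypes.c_int
    lib.TessPageIteratorNext.argtypes = [ctypes.c_void_p, ctypes.c_int]


def bounding_box(lib, page_iterator, level=RIL_WORD):
    """Return (left, top, right, bottom), or None if the call reports no box."""
    left, top, right, bottom = (ctypes.c_int() for _ in range(4))
    # Pass the addresses of the four ints so the library can write into them.
    ok = lib.TessPageIteratorBoundingBox(
        page_iterator, level,
        ctypes.byref(left), ctypes.byref(top),
        ctypes.byref(right), ctypes.byref(bottom))
    if not ok:
        return None
    return left.value, top.value, right.value, bottom.value


def word_boxes(lib, api_handle, level=RIL_WORD):
    """Collect one box per element at `level`; assumes the image is already set."""
    if lib.TessBaseAPIRecognize(api_handle, None) != 0:
        raise RuntimeError("TessBaseAPIRecognize reported a failure")
    result_iterator = lib.TessBaseAPIGetIterator(api_handle)
    if not result_iterator:
        return []
    try:
        page_iterator = lib.TessResultIteratorGetPageIterator(result_iterator)
        boxes = []
        while True:
            box = bounding_box(lib, page_iterator, level)
            if box is not None:
                boxes.append(box)
            if not lib.TessPageIteratorNext(page_iterator, level):
                break
        return boxes
    finally:
        lib.TessResultIteratorDelete(result_iterator)
```

With the script above this would presumably be used as declare_prototypes(tesseractLib) right after loading the DLL, then word_boxes(tesseractLib, tesseractHandle) after TessBaseAPISetImage2. On a 64-bit build the other handle-taking calls (TessBaseAPIInit3, TessBaseAPISetImage2, pixRead) would likely need the same kind of declarations, and the '.' and 'eng' arguments probably need to be bytes rather than str, since ctypes passes a Python 3 str as a wide-character string by default.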
